HITTING TIME THEOREMS FOR RANDOM MATRICES



LOUIGI ADDARIO-BERRY AND LAURA ESLAVA 



Abstract. Starting from an $n$-by-$n$ matrix of zeros, choose uniformly random zero entries and change them to ones, one-at-a-time, until the matrix becomes invertible. We show that with probability tending to one as $n \to \infty$, this occurs at the very moment the last zero row or zero column disappears. We prove a related result for random symmetric Bernoulli matrices, and give quantitative bounds for some related problems. These results extend earlier work by Costello and Vu [9].



1. Introduction 



In this paper, we initiate an investigation of hitting time theorems for random
matrix processes. Hitting time theorems have their origins in the study of random
graphs; we briefly review this history, then proceed to an overview of recent work
on discrete and continuous random matrix models and a statement of our results.

To begin, consider the classical Erdős-Rényi graph process $\{G_{n,p}\}_{0 \le p \le 1}$, defined
as follows. Independently for each pair $\{i,j\} \subset [n] = \{1,\dots,n\}$, let $U_{ij}$ be a
Uniform[0, 1] random variable. Then, for $p \in [0, 1]$ let $G_{n,p}$ have vertex set $[n]$
and edge set $\{\{i,j\} : U_{ij} \le p\}$. In $G_{n,p}$, each edge is independently present with
probability $p$, and for $p < p'$ we have that $G_{n,p}$ is a subgraph of $G_{n,p'}$.
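
The coupling just described can be sketched in a few lines of Python. This is only an illustration: the function names `sample_uniforms` and `edges_at`, the seed, and the values of $n$ and $p$ are assumptions made for the example, not notation from the paper.

```python
import itertools
import random

def sample_uniforms(n, rng):
    """Draw one Uniform[0, 1] label U_ij for each unordered pair {i, j}."""
    return {frozenset(pair): rng.random()
            for pair in itertools.combinations(range(n), 2)}

def edges_at(uniforms, p):
    """Edge set of G_{n,p}: all pairs whose label is at most p."""
    return {pair for pair, u in uniforms.items() if u <= p}

rng = random.Random(0)           # fixed seed, chosen only for reproducibility
U = sample_uniforms(6, rng)      # n = 6 is an arbitrary illustrative size
sparse = edges_at(U, 0.2)
dense = edges_at(U, 0.5)
is_monotone = sparse <= dense    # G_{n,0.2} is a subgraph of G_{n,0.5}
```

Because every graph in the process is read off from the same labels $U_{ij}$, raising $p$ can only add edges, which is the monotonicity used throughout the discussion of hitting times.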

Bollobás and Frieze [4] proved the following hitting time theorem for $G_{n,p}$, which
is closely related to the main result of the present work. Let $\tau_{\delta \ge 1}$ be the first time
$p$ that $G_{n,p}$ has minimum degree one, and let $\tau_{pm}$ be the first time $p$ that $G_{n,p}$
contains a perfect matching (or let $p = 1$ if $n$ is odd). Then as $n \to \infty$ along even $n$,

$$P(\tau_{\delta \ge 1} = \tau_{pm}) \to 1.$$

In other words, the first moment that the trivial obstacle to perfect matchings (isolated
vertices) disappears, with high probability a perfect matching appears. Ajtai,
Komlós and Szemerédi [1] had slightly earlier shown a hitting time theorem for
Hamiltonicity: the first time $G_{n,p}$ has minimum degree two, with high probability
$G_{n,p}$ is Hamiltonian. In fact, [4] generalizes this, showing that if $\tau_{\delta \ge 2k}$ is the first
time $G_{n,p}$ has minimum degree $2k$ and $\tau_{k\text{-Ham}}$ is the first time $G_{n,p}$ contains $k$
disjoint Hamilton cycles, then with high probability $\tau_{\delta \ge 2k} = \tau_{k\text{-Ham}}$. Hitting time
theorems have since been proved for a wide variety of other models and properties,
including: connectivity [5], $k$-edge-connectivity [3, 20] and $k$-vertex-connectivity [3]
in random graphs and in maker-breaker games; connectivity in geometric graphs
[18]; and Hamiltonicity in geometric graphs and in the $d$-dimensional Gilbert model
[2].

In this work we introduce the study of hitting time theorems for random discrete 
matrices. The study of random matrices is burgeoning, with major advances in 
our understanding over the last three to five years. Thanks to work by a host 
of researchers, the behaviour of the determinant [21], a wide range of spectral 
properties [11, 24], invertibility [6], condition numbers [22, 23], and singular values 
[19, 15] are now well (though not perfectly) understood. (This list of references 
is representative, rather than exhaustive.) The recent paper [13] provides a nice 
collection of open problems, with a focus on random discrete matrices. 

In order to have a concept of hitting time theorems for matrices, we need to
consider matrix processes, and we focus on two such processes. The first is the
Bernoulli process $\mathcal{R}_n = \{R_{n,p}\}_{0 \le p \le 1}$, defined as follows. Independently for each
ordered pair $\{ij : 1 \le i \ne j \le n\}$, let $U_{ij}$ be a Uniform[0, 1] random variable. Then
let $R_{n,p}$ be an $n$-by-$n$ matrix with $i,j$ entry $R_{n,p}(i,j)$ equal to one if $U_{ij} \le p$ and
zero otherwise. For $n \ge 1$ and $0 \le p \le 1$ we let $H_{n,p}$ be the directed graph with
adjacency matrix $R_{n,p}$, so $\{H_{n,p}\}_{0 \le p \le 1}$ is a directed Erdős-Rényi graph process.
(We take $R_{n,p}$ to have zero diagonal entries as it is technically convenient for $H_{n,p}$
to have no loop edges; however, all our results for this model would still hold
if the diagonal entries were generated by independent uniform random variables
$\{U_{ii} : 1 \le i \le n\}$, and with essentially identical proofs.)

The second model we consider is the symmetric Bernoulli process $\mathcal{Q}_n = \{Q_{n,p}\}_{0 \le p \le 1}$:
with $U_{ij}$ as above, for $1 \le i < j \le n$ let $Q_{n,p}(i,j) = Q_{n,p}(j,i) = \mathbf{1}_{[U_{ij} \le p]}$, and set
all diagonal entries equal to zero. Throughout the paper, we work in a space in
which $Q_{n,p}$ is the adjacency matrix of $G_{n,p}$ for each $0 \le p \le 1$. The principal contribution
of this paper is to prove hitting time theorems for invertibility (or full rank) for
both the Bernoulli matrix process and the symmetric Bernoulli process; we now
proceed to state our new contributions in detail.

2. Statement of results 

Given a real-valued matrix $M$, write

$$Z^{row}(M) = \{i : \text{all entries in row } i \text{ of } M \text{ equal zero}\},$$

define $Z^{col}(M)$ similarly, and let $z(M) = \max(|Z^{row}(M)|, |Z^{col}(M)|)$. Given a
collection of matrices $\mathcal{M} = \{M_p\}_{0 \le p \le 1}$, let $\tau(\mathcal{M}) = \inf\{p : z(M_p) = 0\}$, with
the convention that $\inf \emptyset = 1$. We write $\tau = \tau(\{M_p\}_{0 \le p \le 1})$ when the matrix
process under consideration is clear. We say that a square matrix $M$ is singular
if $\det M = 0$, and otherwise say that $M$ is non-singular. Our main result is the
following.

Theorem 2.1. As $n \to \infty$ we have

$$P(R_{n,\tau(\mathcal{R}_n)} \text{ is non-singular}) \to 1,$$
$$P(Q_{n,\tau(\mathcal{Q}_n)} \text{ is non-singular}) \to 1.$$
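
To make the objects in Theorem 2.1 concrete, the following Python sketch samples the labels $U_{ij}$, computes the hitting time $\tau(\mathcal{R}_n)$ directly from them, and checks whether $R_{n,\tau}$ has full rank. It is a minimal illustration under stated assumptions: the helper names, the seed and the size $n = 30$ are choices made here, and the exact rank routine is ordinary Gaussian elimination over the rationals rather than anything taken from the paper.

```python
import random
from fractions import Fraction

def sample_uniforms(n, rng):
    """U[i][j] ~ Uniform[0, 1] for i != j; the diagonal is None (always zero)."""
    return [[rng.random() if i != j else None for j in range(n)] for i in range(n)]

def matrix_at(U, p):
    """R_{n,p}: entry (i, j) is 1 exactly when i != j and U_ij <= p."""
    n = len(U)
    return [[1 if U[i][j] is not None and U[i][j] <= p else 0
             for j in range(n)] for i in range(n)]

def z(M):
    """max(number of zero rows, number of zero columns)."""
    zero_rows = sum(1 for row in M if not any(row))
    zero_cols = sum(1 for col in zip(*M) if not any(col))
    return max(zero_rows, zero_cols)

def hitting_time(U):
    """Smallest p with z(R_{n,p}) = 0: each row and column needs a label <= p."""
    n = len(U)
    row_first = [min(U[i][j] for j in range(n) if j != i) for i in range(n)]
    col_first = [min(U[i][j] for i in range(n) if i != j) for j in range(n)]
    return max(row_first + col_first)

def rank(M):
    """Exact rank over the rationals by Gaussian elimination."""
    A = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(A[0]) if A else 0):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

rng = random.Random(1)                 # fixed seed for reproducibility only
n = 30                                 # illustrative size
U = sample_uniforms(n, rng)
tau = hitting_time(U)
R_tau = matrix_at(U, tau)
nonsingular_at_tau = rank(R_tau) == n  # Theorem 2.1: true with high probability
```

Repeating the experiment over many seeds and increasing $n$ gives an empirical feel for the theorem; a single run at small $n$ proves nothing either way.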

In proving Theorem 2.1, we also obtain the following new result, which states
that for a wide range of probabilities $p$, with high probability there are no non-trivial
linear dependencies in the random matrix $R_{n,p}$.

Theorem 2.2. For any fixed $c > 1/2$, uniformly over $p \in (c \ln n/n, 1/2)$, we have

$$P(\mathrm{rank}(R_{n,p}) = n - z(R_{n,p})) = 1 - O((\ln\ln n)^{-1/4}).$$

The analogue of Theorem 2.2 for the symmetric process $\mathcal{Q}_n$ was established by
Costello and Vu [9], and our analysis builds on theirs as well as that of [7]. The
requirement that $c > 1/2$ in Theorem 2.2 is necessary, since for $p = c \ln n/n$ with
$c < 1/2$, with probability $1 - o(1)$ the matrix $R_{n,p}$ will contain two identical rows
each with a single non-zero entry, as well as two identical columns each with a single
non-zero entry; in this case $\mathrm{rank}(R_{n,p}) < n - z(R_{n,p})$.
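
A small worked instance of this obstruction may help. The $3 \times 3$ matrix below is an illustrative example chosen here, not one taken from the paper: it has zero diagonal, its first two rows are identical with a single one each, its first two columns are identical with a single one each, and it has no zero rows or columns.

```latex
M = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix},
\qquad z(M) = 0,
\qquad \operatorname{rank}(M) = 2 < 3 = n - z(M).
```

The rank deficit here is caused by the repeated row rather than by any zero row or column, which is exactly the kind of non-trivial dependency that Theorem 2.2 rules out for $c > 1/2$.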

Our analyses of the processes $\mathcal{R}_n$ and $\mathcal{Q}_n$ are similar, but each presents its own
difficulties. In the former, lack of symmetry yields a larger number of potential
"problematic configurations" to control; in the latter, symmetry reduces independence
between the matrix entries. Where possible, we treat the two processes in a
unified manner, but on occasion different proofs are required for the two models.

There are two main challenges in proving Theorem 2.1. First, there are existing
bounds of the form of Theorem 2.2 for the symmetric Bernoulli process [9]. However,
in both models, $\tau$ is of order $\ln n/n + \Theta(1/n)$. This is rather diffuse; it means that
the moment when the last zero row/column disappears is spread over a region in
which $\Theta(n)$ ones appear in the matrix ($\Theta(n)$ new edges appear in the associated
graph). As such, a straightforward argument from Theorem 2.2 or its symmetric
analogue, using a union bound, is impossible. This is not purely a weakness of
our methods. Indeed, if the matrix contains two identical non-zero rows then it
is singular, and the probability there are two such rows (each containing a single
non-zero entry, say) when $p \le \ln n/n$ is $\Omega(\ln n/n)$. This is already too large for a
naive union bound to succeed. Moreover, with current techniques there seems no
hope of replacing our bound by one that is even, say, $O(n^{-\varepsilon})$ for any positive $\varepsilon$, so
another type of argument is needed.

The second challenge is that invertibility is not an increasing property (adding
ones to a zero-one matrix can destroy invertibility). All existing proofs of hitting
time theorems for graphs (of which we are aware) use monotonicity, usually in the
following way. An increasing graph property is a collection $\mathcal{G}$ of graphs, closed under
graph isomorphism, and such that if $G \in \mathcal{G}$ and $G$ is a subgraph of $H$, then $H \in \mathcal{G}$.
Suppose that $\mathcal{H}$ and $\mathcal{K}$ are increasing graph properties with $\mathcal{H} \subset \mathcal{K}$. If there is a
function $f(n,p)$ such that uniformly in $0 < p < 1$,

$$P(G_{n,f(n,p)} \in \mathcal{H}) = p + o(1) = P(G_{n,f(n,p)} \in \mathcal{K}),$$

then with probability $1 - o(1)$, the first hitting times of $\mathcal{H}$ and of $\mathcal{K}$ coincide. This
follows easily from the fact that $\mathcal{H}$ and $\mathcal{K}$ are increasing. However, it breaks down
for non-increasing properties and there is no obvious replacement.
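
The failure of monotonicity is easy to see on a tiny example. The Python sketch below (an illustration chosen here, using $2 \times 2$ matrices that, unlike $R_{n,p}$, are allowed ones on the diagonal) adds ones to a zero-one matrix one at a time and records the determinant after each step.

```python
def det2(M):
    """Determinant of a 2-by-2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Each matrix is obtained from the previous one by changing a single 0 to a 1.
steps = [
    [[1, 0], [0, 1]],   # identity: invertible
    [[1, 1], [0, 1]],   # one more 1: still invertible
    [[1, 1], [1, 1]],   # one more 1: now singular
]
dets = [det2(M) for M in steps]
```

The sequence of determinants is 1, 1, 0, so an invertible matrix became singular by adding a one; the set of invertible zero-one matrices is therefore not closed under adding entries.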

To get around these two issues, we introduce a method for decoupling the event 
of having full rank from the precise time the last zero row or column disappears. 
This method is most easily explained in graph terminology. We take a subset 
of the vertices of the graph under consideration, and replace their (random) out- 
and/or in-neighbourhoods with deterministic sets of neighbours. We prove results 
about the modified model, and then show that by a suitable averaging out, we can 
recover results about the original, fully random model. We believe the results about 
the partially deterministic models are independently interesting, and we now state 
them. 



Definition 2.3. Given $n \ge 1$, a template (or $n$-template) is an ordered pair
$\mathcal{L} = (\mathcal{L}^+, \mathcal{L}^-) = ((S_i^+)_{i \in I^+}, (S_j^-)_{j \in I^-})$, where
(1) $I^+, I^-$ are subsets of $[n]$,
(2) $(S_i^+)_{i \in I^+}$ and $(S_j^-)_{j \in I^-}$ are sequences of non-empty, pairwise disjoint subsets of $[n]$,
(3) $\bigcup_{i \in I^+} S_i^+ \subset [n] \setminus I^-$ and $\bigcup_{j \in I^-} S_j^- \subset [n] \setminus I^+$.
The size of $\mathcal{L}$ is $\max(|I^+|, |I^-|, \max(|S_i^+|, i \in I^+), \max(|S_i^-|, i \in I^-))$. We write
$\mathcal{I} = \mathcal{I}(\mathcal{L}) = (I^+, I^-)$. Also, we say $\mathcal{L}$ is symmetric if $\mathcal{L}^+ = \mathcal{L}^-$. Finally, for $\ell \in \mathbb{N}$,
we let $\mathcal{M}^n(\ell)$ be the collection of $n$-templates of size at most $\ell$.

We remark there is a unique template $\mathcal{L}$ of size zero, which satisfies $I^+ = \emptyset = I^-$;
we call this template degenerate.

Given $n \ge 1$, an undirected or directed graph $G$ on $n$ vertices and a template
$\mathcal{L}$ as defined above, let $G^{\mathcal{L}}$ be the graph obtained from $G$ by letting each $i \in I^+$
have out-neighbours $S_i^+$ (and no others) and each $j \in I^-$ in-neighbours $S_j^-$ (and
no others). Note that if $G$ is undirected and $\mathcal{L}$ is symmetric, then $G^{\mathcal{L}}$ is again
undirected, provided we view a pair $uv$, $vu$ of directed edges as a single undirected
edge. We write $Q^{\mathcal{L}}_{n,p}$ and $R^{\mathcal{L}}_{n,p}$ for the adjacency matrices of $G^{\mathcal{L}}_{n,p}$ and $H^{\mathcal{L}}_{n,p}$. In $Q^{\mathcal{L}}_{n,p}$
and $R^{\mathcal{L}}_{n,p}$, for $i \in I^+$ (resp. $i \in I^-$), the non-zero entries of row $i$ (resp. column $i$) are precisely those
with indices in $S_i^+$ (resp. in $S_i^-$).
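
The passage from a matrix to its templated version can be sketched as follows. This is a hedged illustration: vertices are indexed from 0 rather than 1, templates are represented as Python dictionaries, and the names `is_template` and `apply_template` are introduced here for the example only.

```python
def is_template(n, S_plus, S_minus):
    """Check conditions (1)-(3) of the template definition (0-indexed vertices).

    S_plus maps each i in I+ to its prescribed out-neighbour set,
    S_minus maps each j in I- to its prescribed in-neighbour set.
    """
    def disjoint_nonempty(family):
        seen = set()
        for s in family.values():
            if not s or seen & s:
                return False
            seen |= s
        return True

    I_plus, I_minus = set(S_plus), set(S_minus)
    out_union = set().union(*S_plus.values()) if S_plus else set()
    in_union = set().union(*S_minus.values()) if S_minus else set()
    everything = I_plus | I_minus | out_union | in_union
    return (everything <= set(range(n))
            and disjoint_nonempty(S_plus) and disjoint_nonempty(S_minus)
            and not (out_union & I_minus) and not (in_union & I_plus))

def apply_template(M, S_plus, S_minus):
    """Copy M, then make rows in I+ and columns in I- deterministic."""
    n = len(M)
    out = [row[:] for row in M]
    for i, targets in S_plus.items():
        for j in range(n):
            out[i][j] = 1 if j in targets else 0
    for j, sources in S_minus.items():
        for i in range(n):
            out[i][j] = 1 if i in sources else 0
    return out

# Complete directed graph on 4 vertices (zero diagonal) as a stand-in for H_{n,p}.
M = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
S_plus, S_minus = {0: {1}}, {2: {3}}
valid = is_template(4, S_plus, S_minus)
modified = apply_template(M, S_plus, S_minus)
```

Condition (3) is what makes the two loops in `apply_template` compatible: an entry lying in both a prescribed row and a prescribed column is set to zero by each of them.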

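Theorem 2.4. Fix $K \in \mathbb{N}$ and $c > 1/2$. For any $p \in (c \ln n/n, 1/2)$ and any
template $\mathcal{L} \in \mathcal{M}^n(K)$,

$$P(\mathrm{rank}(R^{\mathcal{L}}_{n,p}) = n - z(R^{\mathcal{L}}_{n,p})) = 1 - O((\ln\ln n)^{-1/4}).$$

If, additionally, $\mathcal{L}$ is symmetric then

$$P(\mathrm{rank}(Q^{\mathcal{L}}_{n,p}) = n - z(Q^{\mathcal{L}}_{n,p})) = 1 - O((\ln\ln n)^{-1/4}).$$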

We briefly remark on the assertions of the latter theorem. First, the first probability
bound immediately implies Theorem 2.2, by taking $I^+$ and $I^-$ to be empty.
Next, the condition of pairwise disjointness is necessary. To see this, note that
if vertices $u$, $v$ have degree one and have a common neighbour $w$ then the rows
of the adjacency matrix corresponding to $u$ and $v$ are identical, creating a
non-degenerate linear relation. Finally, in proving the theorem we in fact only require
that UK — maxjg/+ \St\, then IJ+I • K = o{p~^ /{n\nn)) (and similiarly for the 
maximum size of S~). As we do not believe this condition is optimal we have opted 
for a more easily stated theorem. However, it would be interesting to know how far 
the boundedness condition could be weakened. 

The proof of Theorem 2.4 is based on an analysis of an iterative exposure of
minors. In brief, we first show that for suitably chosen $n' < n$, a fixed $n'$-by-$n'$
minor of $R_{n,p}$ is likely to have nearly full rank. Adding the missing rows and
columns one-at-a-time, we then show that any remaining dependencies are likely to
be "resolved", i.e. eliminated, by the added rows. Our argument is similar to that
appearing in [9] for the study of $Q_{n,p}$, but there are added complications due to the
fact that our matrices are partially deterministic on the one hand, and asymmetric
on the other. The proof of Theorem 2.4 occupies a substantial part of the paper; a
somewhat more detailed sketch appears in Section 4.

Vershynin [25] has very recently strengthened the bounds of Costello, Tao and
Vu [7], showing that for a broad range of dense symmetric random matrices,
the singularity probability decays at least as quickly as $e^{-n^{\beta}}$, for some
(model-dependent) $\beta > 0$. It seems plausible (though not certain) that Vershynin's
techniques could be transferred to the current, sparse setting, to yield bounds of the
form 1 — (9(e~('"'"") ) in Theorems 2.2 and 2.4. However, we believe, and the 
results of [8] suggest, that aside from zero-rows, the most likely cause of singularity 
is two identical rows, each containing a single one. If the latter is correct then it 
should in fact be possible to obtain bounds of the form 1 — 0(n^^^'^) (for c > 1/2 
as above); as alluded to earlier, for the moment such bounds seem out of reach. 

Notation 

Before proceeding with details, we briefly pause to introduce some terminology.
Given an $m \times m$ matrix $M = (m_{ij})_{1 \le i,j \le m}$, the deficiency of $M$ is the quantity

$$Y(M) := m - \mathrm{rank}(M) - z(M).$$

Also, for any $i,j \in [m]$ we denote by $M^{(i,j)}$ the matrix obtained by removing
the $i$-th row of $M$ and the $j$-th column; we refer to $M^{(i,j)}$ as the $(i,j)$ minor of $M$.
More generally, given $A,B \subset [m]$ we write $M^{(A,B)}$ for the matrix obtained from
$M$ by removing the rows indexed by $A$ and the columns indexed by $B$. Also, for
$1 \le k \le m$ we write $M[k] = (m_{ij})_{1 \le i,j \le k}$.

For a graph $G = (V, E)$ and $v \in V$, we write $N_G^+(v)$ for the set of out-neighbours
of $v$ in $G$ and $N_G^-(v)$ for the set of in-neighbours of $v$ in $G$. (If $G$ is undirected then
we write $N_G(v) = N_G^+(v) = N_G^-(v)$ for the set of neighbours of $v$.) If $M$ is the
adjacency matrix of $G$ then for $1 \le i \le m$ we write $N_M^+(i) = N_G^+(i)$, and similarly
for $N_M^-(i)$ (and $N_M(i)$ if $M$ is symmetric). Note that in this case, $Z^{row}(M)$ and
$Z^{col}(M)$ correspond to the sets $\{v \in V : |N^+(v)| = 0\}$ and $\{v \in V : |N^-(v)| = 0\}$,
respectively.

Given real random variables $X$ and $Y$ we write $X \preceq_{st} Y$ if for all $t \in \mathbb{R}$,
$P(X \ge t) \le P(Y \ge t)$, and in this case say that $Y$ stochastically dominates $X$.
Finally, we omit floors and ceilings for readability whenever possible.

Outline 

The structure of the remainder of the paper is as follows. In Section 3 we explain 
how to "decouple" the linear dependencies of the matrix process from the time at 
which the last zero row or zero column disappears. This decoupling allows us to 
prove Theorem 2.1 assuming that Theorem 2.4 holds. 

In Section 4 we introduce the iterative exposure of minors, analogous to the vertex
exposure martingale for random graphs, which we use to study how $Y(R_{n,p}[m])$
changes as $m$ increases. We then state a key "coupling" lemma (Lemma 4.3),
which asserts that for some $n' < n$ with $n - n'$ sufficiently small, the process
$(Y(R_{n,p}[m]), n' \le m \le n)$ is stochastically dominated by a reflected simple random
walk with strongly negative drift. Postponing the proof of this lemma, we then
show how Theorem 2.4 follows by standard hitting probability estimates for
simple random walk.

In Section 5, we describe "good" structural properties, somewhat analogous to
graph expansion, that we wish for the matrices $R_{n,p}[m]$ to possess. The properties
we require are tailored to allow us to apply linear and quadratic Littlewood-Offord
bounds on the concentration of random sums. Proposition 5.10, whose proof is
postponed, states that these properties hold with high probability throughout the
iterative exposure of minors. Assuming the properties hold, it is then a
straightforward matter to complete the proof of Lemma 4.3.

In Section 6 we complete the proof of Proposition 5.10. This, the most technical
part of the proof, is most easily described in the language of random graphs rather
than of random matrices. It is largely based on establishing suitable expansion
and intersection properties of the neighbourhoods that hold for all "small" sets of
vertices and in each of the graphs $R_{n,p}[m]$ considered in the iterative exposure of
minors.

Finally, Appendix A states standard binomial tail bounds that we use in our 
proofs, and Appendix B contains proofs of results that are either basic but technical, 
or that essentially follow from previous work but do not directly apply in our setting. 

3. Decoupling connectivity from rank estimates: the proof of 
Theorem 2.1 from Theorem 2.4 

In this section we explain how the first assertion of Theorem 2.1 follows from the 
first assertion of Theorem 2.4. A similar argument applies to the second assertion 
of Theorem 2.1; we comment on the necessary adjustments to the argument in 
Section 3.1. 

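Recall that the process $\mathcal{R}_n$ is generated by a family $\{U_{ij} : 1 \le i \ne j \le n\}$ of
independent Uniform[0, 1] random variables. Given $I^+, I^- \subset [n]$, let $\mathcal{I} = (I^+, I^-)$
and write

$$\mathcal{F}_{\mathcal{I}} = \sigma(\{U_{ij} : i \in I^+ \text{ or } j \in I^-\}),$$

$$\mathcal{G}_{\mathcal{I}} = \sigma(\{U_{ij} : i \in [n] \setminus I^+ \text{ and } j \in [n] \setminus I^-\}).$$

Informally, $\mathcal{F}_{\mathcal{I}}$ contains all information that can be determined from the process
$\mathcal{R}_n$ by observing only rows with indices in $I^+$ and columns with indices in $I^-$. All
information about all remaining entries is contained in $\mathcal{G}_{\mathcal{I}}$.
Next, given $p \in (0, 1)$, let

$$A_{\mathcal{I}}(p) = \{z(R_{n,p}^{(I^+, I^-)}) = 0\}$$

and let

$$B_{\mathcal{I}}(p) = \{I^+ \subset Z^{row}(R_{n,p}),\ I^- \subset Z^{col}(R_{n,p})\}.$$

In words, $A_{\mathcal{I}}(p)$ is the event that the matrix obtained from $R_{n,p}$ by deleting the
rows indexed by $I^+$ and the columns indexed by $I^-$ has neither zero rows nor zero
columns. We remark that $A_{\mathcal{I}}(p)$ and $B_{\mathcal{I}}(p)$ are measurable with respect to $\mathcal{G}_{\mathcal{I}}$ and
$\mathcal{F}_{\mathcal{I}}$, respectively, and that $A_{\mathcal{I}}(p) \cap B_{\mathcal{I}}(p)$ is precisely the event that $Z^{row}(R_{n,p}) = I^+$
and $Z^{col}(R_{n,p}) = I^-$. We write $A_{\mathcal{I}}, B_{\mathcal{I}}$ instead of $A_{\mathcal{I}}(p), B_{\mathcal{I}}(p)$ when the
dependence on $p$ is clear from context.
Finally, let

$$\tau_{\mathcal{I}} = \tau_{\mathcal{I}}(\mathcal{R}_n) = \min\{p \in (0, 1) : Z^{row}(R_{n,p}) \cap I^+ = \emptyset = Z^{col}(R_{n,p}) \cap I^-\}.$$

Then, for any template $\mathcal{L} = ((S_i^+)_{i \in I^+}, (S_j^-)_{j \in I^-})$, let

$$C_{\mathcal{L}} = \{\forall i \in I^+,\ N^+_{R_{n,\tau_{\mathcal{I}}}}(i) = S_i^+\} \cap \{\forall i \in I^-,\ N^-_{R_{n,\tau_{\mathcal{I}}}}(i) = S_i^-\}.$$

Observe that $C_{\mathcal{L}}$ and $\tau_{\mathcal{I}}$ are measurable with respect to $\mathcal{F}_{\mathcal{I}}$. Furthermore, the
entries that are random in $R^{\mathcal{L}}_{n,p}$ are precisely those corresponding to the random
variables generating $\mathcal{G}_{\mathcal{I}}$. Lemma 3.1, below, uses this fact in order to express
the conditional distribution of $\tau$ given $A_{\mathcal{I}}$, $B_{\mathcal{I}}$, and $C_{\mathcal{L}}$, as an integral against a
conditional density function.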

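Lemma 3.1. Fix $1 \le \ell \le n$ and $p \in (0,1)$. Then for any non-degenerate
$n$-template $\mathcal{L} = ((S_i^+)_{i \in I^+}, (S_j^-)_{j \in I^-})$, letting $\mathcal{I} = (I^+, I^-)$ and writing $\tau = \tau(\mathcal{R}_n)$,
we have

$$P(Y(R_{n,\tau}) = 0 \mid A_{\mathcal{I}}(p), B_{\mathcal{I}}(p), C_{\mathcal{L}}) = \int_0^1 P(Y(R^{\mathcal{L}}_{n,t}) = 0 \mid A_{\mathcal{I}}(p))\, f(t \mid B_{\mathcal{I}}(p), C_{\mathcal{L}})\, dt,$$

where $f(\cdot \mid B_{\mathcal{I}}(p), C_{\mathcal{L}})$ is the conditional density of $\tau_{\mathcal{I}}$ given $B_{\mathcal{I}}(p)$ and $C_{\mathcal{L}}$.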

In proving Lemma 3.1 we use the following basic observation. Given independent
$\sigma$-algebras $\mathcal{F}_1$ and $\mathcal{F}_2$, for every $E_1, F_1 \in \mathcal{F}_1$ and $E_2, F_2 \in \mathcal{F}_2$, we have

$$P(E_1, E_2 \mid F_1, F_2) = P(E_1 \mid F_1)\, P(E_2 \mid F_2).$$

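Proof. If $A_{\mathcal{I}}$, $B_{\mathcal{I}}$ and $C_{\mathcal{L}}$ all occur, we necessarily have $\mathrm{rank}(R_{n,\tau}) = \mathrm{rank}(R^{\mathcal{L}}_{n,\tau})$
and $\tau_{\mathcal{I}} = \tau$. For any $K \in \mathbb{N}$, we may thus rewrite $P(Y(R_{n,\tau}) = 0 \mid A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}})$ as

$$\sum_{i=0}^{K-1} P\left(Y(R^{\mathcal{L}}_{n,\tau_{\mathcal{I}}}) = 0,\ \tau_{\mathcal{I}} \in \left[\tfrac{i}{K}, \tfrac{i+1}{K}\right) \,\middle|\, A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}}\right).$$

For any $0 \le i < K$, if $\frac{i}{K} \le \tau_{\mathcal{I}} < \frac{i+1}{K}$ and no edges arrive in the interval $(\tau_{\mathcal{I}}, \frac{i+1}{K}]$,
then $R^{\mathcal{L}}_{n,\tau_{\mathcal{I}}}$ and $R^{\mathcal{L}}_{n,(i+1)/K}$ are identical. Writing $D$ for the event that a pair of distinct
edges arrive within one of the intervals $\{[\frac{i}{K}, \frac{i+1}{K}], 0 \le i < K\}$, it follows that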

p(y(i?„,0 = o I Ai,Bx,C£) 

- E P(^(<^)'^^ ^ [i'¥) I Ax,Bx,Cc 

For fixed edges e and e', we have P (|C/e — t^e'l — 1^) ^ ;^- By a union bound it 
follows that P iD) < "^^~ ' , and so the final term in (1) tends to as X — > 00. 

Finally, by the observation about conditional independence just before the start 
of the proof, for all K we have 

K-l 



Y, P (r(i?f,H^),Tx G [i, ¥) I Ax,Bx,Cc 



1=0 
K-l 



j=0 

and taking $K \to \infty$ completes the proof. □

The next definition captures the event that, for a given $p < \tau$, the rows (resp.
columns) of $R_{n,\tau}$ indexed by $Z^{row}(R_{n,p})$ (resp. $Z^{col}(R_{n,p})$) are such that
Theorem 2.4 can be applied.

Definition 3.2. Given $0 < p < p' < 1$ and $K \in \mathbb{N}$, let $\mathcal{D}_K(p, p')$ be the event
that $((N^+_{R_{n,p'}}(i))_{i \in Z^{row}(R_{n,p})}, (N^-_{R_{n,p'}}(i))_{i \in Z^{col}(R_{n,p})})$ is a non-degenerate $n$-template
of size at most $K$.

Lemma 3.3. For any $\varepsilon > 0$ there exists $a > 0$ and integer $K = K(a) > 0$ such
that, setting $p_1 = \frac{\ln n - a}{n}$ and $p_2 = \frac{\ln n + a}{n}$, for all $n$ sufficiently large,

$$P(\tau(\mathcal{R}_n) \ge p_2) \le \varepsilon,$$
$$P(\mathcal{D}_K(p_1, \tau(\mathcal{R}_n))) \ge 1 - \varepsilon.$$

The proof of Lemma 3.3 is straightforward but technical, and is presented in
Appendix B.

Now recall the definition of $Y(M)$ from the Notation section. Observe
that the first claim of Theorem 2.1 is equivalent to the statement that with
high probability $Y(R_{n,\tau(\mathcal{R}_n)}) = 0$. Therefore, to establish the first assertion of
Theorem 2.1, it suffices to prove the following theorem.

Theorem 3.4. $P(Y(R_{n,\tau(\mathcal{R}_n)}) = 0) \to 1$ as $n \to \infty$.

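Proof. Fix $\varepsilon > 0$, let $a > 0$ and $K = K(a) > 0$ be as in Lemma 3.3. Throughout the
proof write $p_1 = \frac{\ln n - a}{n}$, $p_2 = \frac{\ln n + a}{n}$, and $\tau = \tau(\mathcal{R}_n)$. Note that $\mathcal{D}_K(p_1, \tau)$ occurs
precisely if there exists a non-degenerate $\mathcal{L} \in \mathcal{M}^n(K)$ such that $A_{\mathcal{I}(\mathcal{L})}(p_1)$, $B_{\mathcal{I}(\mathcal{L})}(p_1)$,
and $C_{\mathcal{L}}$ all occur. Furthermore, if $\mathcal{L} \ne \mathcal{L}'$ then $A_{\mathcal{I}(\mathcal{L})}(p_1) \cap B_{\mathcal{I}(\mathcal{L})}(p_1) \cap C_{\mathcal{L}}$ and
$A_{\mathcal{I}(\mathcal{L}')}(p_1) \cap B_{\mathcal{I}(\mathcal{L}')}(p_1) \cap C_{\mathcal{L}'}$ are disjoint events. Writing $\widetilde{\mathcal{M}}^n(K) = \{\mathcal{L} \in \mathcal{M}^n(K) :
\mathcal{L} \text{ is non-degenerate}\}$, it follows that

$$P(Y(R_{n,\tau}) = 0) \ge P(Y(R_{n,\tau}) = 0,\ \mathcal{D}_K(p_1, \tau)) = \sum_{\mathcal{L} \in \widetilde{\mathcal{M}}^n(K)} P(Y(R_{n,\tau}) = 0,\ A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}}).$$

We will show that for any $\mathcal{L} \in \widetilde{\mathcal{M}}^n(K)$,

$$\frac{P(Y(R_{n,\tau}) = 0,\ A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}})}{P(A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}},\ \tau \le p_2)} \ge 1 - o(1). \qquad (2)$$

Assuming this, it follows that

$$P(Y(R_{n,\tau}) = 0) \ge (1 - o(1)) \sum_{\mathcal{L} \in \widetilde{\mathcal{M}}^n(K)} P(A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}},\ \tau \le p_2) = (1 - o(1))\, P(\mathcal{D}_K(p_1, \tau),\ \tau \le p_2) \ge 1 - 2\varepsilon$$

for $n$ large, the last inequality by Lemma 3.3. Since $\varepsilon > 0$ was arbitrary, it thus
remains to prove (2), for which we use Lemma 3.1.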

Fix C e M'^{K) and \el N ^ n ~ \I+\> n ~ K. Then pi = i^i^ = ^2^^ + 
0(-^). For any fixed integer i >1 and distinct ui, . . . , u^ G [n] \ /+, 

p I n {N+c^^ {v,) = 0} J ^ (1 - p,)'(^-)+^(-i) = (1 + o(i))(i _ p,f-^, 
and it follows by the method of moments (see [12], Chapter 6) that $|Z^{row}(R^{\mathcal{L}}_{n,p_1})|$ is
asymptotically Poisson($e^{a}$). The same argument establishes that $|Z^{col}(R^{\mathcal{L}}_{n,p_1})|$ has
the same asymptotic distribution. It follows that P (Ax) < 2e^'^ + o(l), so by the 
first assertion of Theorem 2.4, for p E {pi,p2) we have 

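Since $a = O(1)$, by Lemma 3.1 we thus have

$$P(Y(R_{n,\tau}) = 0 \mid A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}}) \ge \int_{p_1}^{p_2} P(Y(R^{\mathcal{L}}_{n,t}) = 0 \mid A_{\mathcal{I}})\, f(t \mid B_{\mathcal{I}}, C_{\mathcal{L}})\, dt$$
$$\ge (1 - o(1)) \int_{p_1}^{p_2} f(t \mid B_{\mathcal{I}}, C_{\mathcal{L}})\, dt = (1 - o(1))\, P(\tau_{\mathcal{I}} \in [p_1, p_2] \mid B_{\mathcal{I}}, C_{\mathcal{L}}).$$

Multiply both sides of the preceding inequality by $P(A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}})$. Since $A_{\mathcal{I}}$ is
independent from $B_{\mathcal{I}}$ and $C_{\mathcal{L}}$, we obtain

$$P(Y(R_{n,\tau}) = 0,\ A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}}) \ge (1 - o(1))\, P(A_{\mathcal{I}}, B_{\mathcal{I}}, C_{\mathcal{L}},\ \tau_{\mathcal{I}} \in [p_1, p_2]).$$

Finally, if $A_{\mathcal{I}}$, $B_{\mathcal{I}}$, and $C_{\mathcal{L}}$ all occur then necessarily $\tau_{\mathcal{I}} = \tau$ and $\tau \ge p_1$, so we may
replace $\{\tau_{\mathcal{I}} \in [p_1, p_2]\}$ by $\{\tau \le p_2\}$ in the final probability, and (2) follows. The
proof is complete. □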

3.1. Notes on the proof of Theorem 2.1 for $Q_{n,p}$. The decoupling of connectivity
from the rank estimates of $R_{n,p}$ is not extremely sensitive to the structure of
$R_{n,p}$ except through Theorem 2.4, and the broad strokes of the argument of this
section are therefore unchanged. In particular, define $\mathcal{F}_{\mathcal{I}}$ and $\mathcal{G}_{\mathcal{I}}$ as before (but
recall that only the variables $\{U_{ij}\}_{1 \le i < j \le n}$ are independent). Then Lemma 3.1
holds under the additional restriction that the template $\mathcal{L}$ is symmetric, as in this
case $I^+ = I^-$ and the $\sigma$-algebras $\mathcal{F}_{\mathcal{I}}$ and $\mathcal{G}_{\mathcal{I}}$ are indeed symmetric. We replace the
event $\mathcal{D}_K(p, p')$ with the event that $((N_{Q_{n,p'}}(i))_{i \in Z(Q_{n,p})}, (N_{Q_{n,p'}}(i))_{i \in Z(Q_{n,p})})$ is a
non-degenerate $n$-template of size at most $K$ (in which case it is necessarily symmetric).
Lemma 3.3 then holds with $\tau(\mathcal{R}_n)$ replaced by $\tau(\mathcal{Q}_n)$, with an essentially
identical proof. Assuming the second bound in Theorem 2.4, the rest of the proof
then follows without substantial changes.

4. Analysis of the iterative exposure process: the proof of 
Theorem 2.4 modulo a coupling lemma 

To prove Theorem 2.4, we analyze an iterative exposure of minors of the matrix.
(In other words, we will expose the edges incident to the vertices of $H^{\mathcal{L}}_{n,p}$ in a
vertex-by-vertex fashion.) This strategy was first used in the context of random
symmetric matrices in [7], to show that random symmetric Bernoulli(1/2) matrices
are almost surely non-singular.

For the remainder of the paper, $c \in (1/2, 1)$ is a fixed constant, and $c \ln n/n <
p < 1/2$. Also, for the rest of the paper, let $\alpha$ be such that $\alpha c \in (1/2, 3/4)$, write
$\gamma = \alpha c - 1/2$, and let $n' = \lfloor \alpha n \rfloor$. Given any integer $K \ge 1$, for $n$ sufficiently large
and any $n$-template $\mathcal{L} = ((S_i^+)_{i \in I^+}, (S_i^-)_{i \in I^-}) \in \mathcal{M}^n(K)$, by permuting the rows
and columns of $R^{\mathcal{L}}_{n,p}$ we may assume that

$$I^+ \cup I^- \cup \bigcup_{i \in I^+} S_i^+ \cup \bigcup_{i \in I^-} S_i^- \subset [n'];$$

we call such $\mathcal{L} \in \mathcal{M}^n(K)$ permissible. We work only with permissible templates
to ensure that in the iterative exposure of minors in $R^{\mathcal{L}}_{n,p}$ starting from $R^{\mathcal{L}}_{n,p}[n']$,
all new off-diagonal matrix entries whose row (resp. column) is not in $\bigcup_{i \in I^+} S_i^+$
(resp. $\bigcup_{i \in I^-} S_i^-$) are Bernoulli($p$) distributed. We begin by showing that $R^{\mathcal{L}}_{n,p}[n']$
is extremely likely to have quite large rank.

Lemma 4.1. For any $\varepsilon > 0$ and $c \ln n/n < p < 1/2$, there exists a constant $c_1$
such that

$$P(\mathrm{rank}(R_{n,p}) \ge (1 - \varepsilon)n) \ge 1 - O(n^{-c_1 n}).$$

The proof of an analogous bound for random symmetric sparse Bernoulli matrices
appears in [9].

Proof. Denote the rows of $R_{n,p}$ by $r_1, \dots, r_n$. Let $S = \mathrm{span}(r_1, \dots, r_{\lfloor (1-\varepsilon)n \rfloor})$ and
$d = \dim S$. Let $R$ be the event that $r_i \in S$ for every $\lfloor (1 - \varepsilon)n \rfloor < i \le n$. By
symmetry we have

$$P(\mathrm{rank}(R_{n,p}) \le (1 - \varepsilon)n) \le \binom{n}{\varepsilon n} P(R), \qquad (4)$$

so we now focus on bounding $P(R)$. To do so, relabel the columns, to express $R_{n,p}$
as a block matrix

$$R_{n,p} = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where $A$ is an $\lfloor (1 - \varepsilon)n \rfloor \times d$ matrix with $\mathrm{rank}(A) = d$. Thus, the columns in $B$ are
in the span of the columns in $A$, so $AG = B$ for some (unique) matrix $G$.

On the other hand, $D$ is an $\lceil \varepsilon n \rceil \times (n - d)$ matrix and $n - d \ge \varepsilon n$. It follows that
there exists $c_2 > 0$ such that for any fixed matrix $M$,

$$P(D = M) \le (1 - p)^{(\varepsilon n)^2 - \varepsilon n} \le e^{-\varepsilon^2 c\, n \ln n + \varepsilon n/2} \le n^{-c_2 n}.$$

The first inequality holds since $D$ has at least $(\varepsilon n)^2 - \varepsilon n$ independent Bernoulli($p$)
entries.

Now, if $R$ holds, then there exists a matrix $F$ such that both $FA = C$ and
$FB = D$ hold. Furthermore, note that if $F'$ also satisfies $F'A = C$ then

$$FB = FAG = F'AG = F'B,$$

so if $R$ occurs then $D = FB$ is uniquely determined by $A$, $B$ and $C$. Consequently,
for any such $F$ we have

$$P(R \mid A, B, C) \le P(D = FB \mid A, B, C) \le n^{-c_2 n}.$$

Since $P(R) = E[P(R \mid A, B, C)]$, by (4) it follows that

$$P(\mathrm{rank}(R_{n,p}) \le (1 - \varepsilon)n) \le \binom{n}{\varepsilon n} P(R) \le \left(\frac{e}{\varepsilon}\right)^{\varepsilon n} n^{-c_2 n}. \qquad □$$
In fact, Lemma 4.1 also holds for $R^{\mathcal{L}}_{n,p}[n']$. To see this, observe that since
$\mathcal{L}$ has size at most $K$, the difference $|\mathrm{rank}(R^{\mathcal{L}}_{n,p}[n']) - \mathrm{rank}(R_{n,p}[n'])|$ is bounded by a
constant depending only on $K$, and so is $O(1)$; moreover, $R_{n,p}[n']$
has the same distribution as $R_{n',p}$. We thus obtain the following corollary.

Corollary 4.2. For any $K \in \mathbb{N}$ and $\varepsilon > 0$, there exists a constant $c_1$ such that
uniformly over $\mathcal{L} \in \mathcal{M}^n(K)$ and $c \ln n/n < p < 1/2$,

$$P(\mathrm{rank}(R^{\mathcal{L}}_{n,p}[n']) \ge (1 - \varepsilon)n') \ge 1 - O(n^{-c_1 n}).$$

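We next consider how the deficiency $Y(R^{\mathcal{L}}_{n,p}[m])$ drops as $m$ increases from $n'$
to $n$. We have

$$Y(R^{\mathcal{L}}_{n,p}[m+1]) = Y(R^{\mathcal{L}}_{n,p}[m]) + 1 + \left(z(R^{\mathcal{L}}_{n,p}[m]) - z(R^{\mathcal{L}}_{n,p}[m+1])\right) - \left(\mathrm{rank}(R^{\mathcal{L}}_{n,p}[m+1]) - \mathrm{rank}(R^{\mathcal{L}}_{n,p}[m])\right), \qquad (5)$$

so $Y$ decreases as the rank increases and, on the other hand, increases when zero
rows or zero columns disappear, that is, when $z(R^{\mathcal{L}}_{n,p}[m]) > z(R^{\mathcal{L}}_{n,p}[m+1])$. To show
that $Y(R^{\mathcal{L}}_{n,p})$ is likely zero, we will couple $(Y(R^{\mathcal{L}}_{n,p}[m]), n' \le m \le n)$ to a simple
random walk with strongly negative drift, in such a way that with high probability
the random walk provides an upper bound for $(Y(R^{\mathcal{L}}_{n,p}[m]), n' \le m \le n)$. Of course,
showing that such a coupling exists involves control on the rank increase and on
the decrease in $z(R^{\mathcal{L}}_{n,p}[m])$ as $m$ increases from $n'$ to $n$. Observe that we always
have $\mathrm{rank}(R^{\mathcal{L}}_{n,p}[m]) \le \mathrm{rank}(R^{\mathcal{L}}_{n,p}[m+1]) \le \mathrm{rank}(R^{\mathcal{L}}_{n,p}[m]) + 2$ since $R^{\mathcal{L}}_{n,p}[m]$ may be
obtained from $R^{\mathcal{L}}_{n,p}[m+1]$ by deleting a single row and column. It follows from (5)
that if $z(R^{\mathcal{L}}_{n,p}[m]) = z(R^{\mathcal{L}}_{n,p}[m+1])$ then $Y(R^{\mathcal{L}}_{n,p}[m+1]) - Y(R^{\mathcal{L}}_{n,p}[m]) \in \{-1, 0, 1\}$.
Also, if $z(R^{\mathcal{L}}_{n,p}[m]) = z(R^{\mathcal{L}}_{n,p}[m+1]) + 1$ then necessarily $\mathrm{rank}(R^{\mathcal{L}}_{n,p}[m+1]) \ge
\mathrm{rank}(R^{\mathcal{L}}_{n,p}[m]) + 1$ and so $Y(R^{\mathcal{L}}_{n,p}[m+1]) - Y(R^{\mathcal{L}}_{n,p}[m]) \in \{0, 1\}$. Together, this shows
that $|Y(R^{\mathcal{L}}_{n,p}[m+1]) - Y(R^{\mathcal{L}}_{n,p}[m])| \le 1$ whenever $z(R^{\mathcal{L}}_{n,p}[m]) - z(R^{\mathcal{L}}_{n,p}[m+1]) \le 1$.
Establishing further control on the rank increase is rather involved, and is the
primary work of Sections 5 and 6. It will turn out that typically, $Y(R^{\mathcal{L}}_{n,p}[m+1])
- Y(R^{\mathcal{L}}_{n,p}[m]) = -1$ when $Y(R^{\mathcal{L}}_{n,p}[m]) > 0$, and $Y(R^{\mathcal{L}}_{n,p}[m+1]) = 0$ when
$Y(R^{\mathcal{L}}_{n,p}[m]) = 0$. More precisely, we have the following lemma.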

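Lemma 4.3. For fixed $K \in \mathbb{N}$, there exists $C > 0$ such that the following holds.
Given integer $n \ge 10$, let $\beta = \beta(n) = C(\ln\ln n)^{-1/4}$. Then uniformly over $\mathcal{L} \in
\mathcal{M}^n(K)$ and $c \ln n/n < p < 1/2$, there exists a coupling of $(Y(R^{\mathcal{L}}_{n,p}[m]), n' \le m \le n)$
and a collection $(X_m, n' \le m < n)$ of iid random variables with $P(X_i = 1) = \beta$ and
$P(X_i = -1) = 1 - \beta$, such that with probability $1 - O(n^{-\gamma/2})$, for all $n' \le m < n$,

$$Y(R^{\mathcal{L}}_{n,p}[m+1]) - Y(R^{\mathcal{L}}_{n,p}[m]) \le \begin{cases} X_m & \text{if } Y(R^{\mathcal{L}}_{n,p}[m]) > 0, \\ \max(X_m, 0) & \text{if } Y(R^{\mathcal{L}}_{n,p}[m]) = 0. \end{cases}$$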

The proof of Lemma 4.3 occupies much of the remainder of the paper. We say the
coupling in the preceding lemma succeeds if for all $n' \le m < n$, the final inequality
holds.
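
The bookkeeping behind identity (5) and the bound on the one-step rank increase can be checked numerically. The Python sketch below is an illustration only: it uses a plain random zero-one matrix with zero diagonal (no template), and the helper names, seed, size and density are assumptions made for the example.

```python
import random
from fractions import Fraction

def rank(M):
    """Exact rank over the rationals by Gaussian elimination."""
    A = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(A[0]) if A else 0):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def z(M):
    """max(number of zero rows, number of zero columns)."""
    zero_rows = sum(1 for row in M if not any(row))
    zero_cols = sum(1 for col in zip(*M) if not any(col))
    return max(zero_rows, zero_cols)

def leading(M, k):
    """The leading k-by-k block M[k]."""
    return [row[:k] for row in M[:k]]

def deficiency(M):
    """Y(M) = m - rank(M) - z(M) for an m-by-m matrix M."""
    return len(M) - rank(M) - z(M)

rng = random.Random(2)   # fixed seed for reproducibility only
n, p = 12, 0.25          # illustrative size and density
M = [[1 if i != j and rng.random() <= p else 0 for j in range(n)]
     for i in range(n)]

checks = []
for m in range(1, n):
    small, big = leading(M, m), leading(M, m + 1)
    rank_step = rank(big) - rank(small)
    rhs = deficiency(small) + 1 + (z(small) - z(big)) - rank_step
    # identity (5), and rank increases by 0, 1 or 2 per exposed row/column pair
    checks.append((deficiency(big) == rhs, 0 <= rank_step <= 2))
```

Both checks hold for every matrix, since (5) is a rearrangement of the definition of $Y$ and deleting one row and one column lowers the rank by at most two; the probabilistic content of Lemma 4.3 lies in how often each value of the rank step occurs.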

Now fix $(X_i, i \ge 1)$ iid random variables with $P(X_i = 1) = \beta$ and $P(X_i = -1) =
1 - \beta$. Set $S_0 = 0$, and for $k \ge 1$ let $S_k = \sum_{i=1}^{k} X_i$. We call $(S_k, k \ge 0)$ a
$\beta$-biased simple random walk (SRW). Also, for $k \ge 1$ let $M_k = \min_{0 \le i \le k} S_i$, and let
$D_k = S_k - M_k$.
Observe that when $S_k$ is not at a new global minimum, $D_{k+1}$ is either $D_k + 1$
(with probability $\beta$) or $D_k - 1$. On the other hand, when $S_k$ is at a new minimum
then $D_k = 0$, and either $D_{k+1} = 1$ (again with probability $\beta$) or $D_{k+1} = 0$.
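
The two cases just described amount to the recursion $D_{k+1} = \max(D_k + X_{k+1}, 0)$. A short Python sketch, with an arbitrary illustrative value of $\beta$ and hypothetical helper names, simulates the walk and confirms this recursion along the sampled path.

```python
import random

def biased_walk(beta, steps, rng):
    """Return S_0, ..., S_steps for a walk with P(+1) = beta, P(-1) = 1 - beta."""
    S = [0]
    for _ in range(steps):
        S.append(S[-1] + (1 if rng.random() < beta else -1))
    return S

def reflected(S):
    """D_k = S_k - min_{0 <= i <= k} S_i, the height above the running minimum."""
    D, running_min = [], 0
    for s in S:
        running_min = min(running_min, s)
        D.append(s - running_min)
    return D

rng = random.Random(3)                 # fixed seed for reproducibility only
S = biased_walk(0.1, 1000, rng)        # beta = 0.1 is an illustrative value
D = reflected(S)
recursion_ok = all(D[k + 1] == max(D[k] + (S[k + 1] - S[k]), 0)
                   for k in range(len(S) - 1))
```

In this form $D$ is a walk reflected at zero, which is the shape of the upper bound that the coupling lemma provides for the deficiency.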

Now imagine for a moment that $Y(R^{\mathcal{L}}_{n,p}[n']) = 0$. In this case, in view of the
preceding paragraph, if the coupling succeeds then we have $D_k \ge Y(R^{\mathcal{L}}_{n,p}[n' + k])$
for all $0 \le k \le n - n'$. It follows that if $Y(R^{\mathcal{L}}_{n,p}[n'])$ happens to equal zero then we
can bound $P(Y(R^{\mathcal{L}}_{n,p}[n]) > 0)$ by bounding $P(D_{n-n'} > 0)$. This is accomplished
by the following proposition and its corollary.

Proposition 4.4. Let $H = |\{k \ge 0 : S_k \ge 1\}|$. Then $E(H) = \beta/(1 - \beta)^2$.

Proof. This is an elementary fact about hitting times for simple random walk, and
in particular follows from Examples 1.3.3 and 1.4.3 of [16]. □

Corollary 4.5. For $k \ge 0$, $P(D_k > 0) \le \beta/(1 - \beta)^2$.

Proof. For any $i \le k$ we have

$$P\left(D_k > 0,\ S_i = \min_{0 \le j \le k} S_j\right) \le P(S_{k-i} \ge 1),$$

and summing over $i$, plus a union bound, yields

$$P(D_k > 0) \le \sum_{i \ge 0} P(S_i \ge 1) = E(H). \qquad □$$

In reality, $Y(R^{\mathcal{L}}_{n,p}[n'])$ may not equal zero, and so we should start the random
walk $S$ not from zero but from a positive height. The following corollary addresses
this.

Corollary 4.6. For any integers $d, k \ge 1$,

$$P(S_k + d > \min(M_k + d, 0)) \le P(S_k > -d) + \beta/(1 - \beta)^2.$$

Proof. Let $\tau = \inf\{i : S_i = -d\}$. By the Markov property, for any $i \le k$,
$P(S_k + d > \min(M_k + d, 0) \mid \tau = i) = P(D_{k-i} > 0) \le \beta/(1 - \beta)^2$ by the preceding
corollary. On the other hand, $P(\tau > k) \le P(S_k > -d)$, and the result follows. □

On the other hand, if $t \ge Y(R^{\mathcal{L}}_{n,p}[n'])$ then when the coupling succeeds we have
$S_k + t - \min(M_k + t, 0) \ge Y(R^{\mathcal{L}}_{n,p}[n' + k])$ for all $0 \le k \le n - n'$. It then follows
from Lemma 4.3 and Corollary 4.6 that for any $\varepsilon > 0$,

$$P(Y(R^{\mathcal{L}}_{n,p}) > 0) \le P(Y(R^{\mathcal{L}}_{n,p}[n']) \ge \varepsilon n) + P(S_{n-n'} > -\varepsilon n) + \frac{\beta}{(1 - \beta)^2} + O(n^{-\gamma/2})$$
$$\le n^{-\Omega(n)} + e^{-\Omega(n)} + \frac{\beta}{(1 - \beta)^2} + O(n^{-\gamma/2})$$
$$= O((\ln\ln n)^{-1/4}),$$

where the second inequality follows from a Chernoff bound for $P(S_{n-n'} > -\varepsilon n)$
(assuming $\varepsilon$ is chosen small enough), plus the bound from Corollary 4.2, and the
last equality follows from the definition of $\beta$ in Lemma 4.3. This proves the first
assertion of Theorem 2.4.

When treating the symmetric model $Q^{\mathcal{L}}_{n,p}$, the following modifications are
required. First, Corollary 4.2 holds for all symmetric $n$-templates $\mathcal{L} \in \mathcal{M}^n(K)$ and
with $Q^{\mathcal{L}}_{n,p}[n']$ in place of $R^{\mathcal{L}}_{n,p}[n']$. This was proved in [9] for $Q_{n,p}[n']$, but as $\mathcal{L}$ has
size $K$, $|\mathrm{rank}(Q_{n,p}[n']) - \mathrm{rank}(Q^{\mathcal{L}}_{n,p}[n'])| \le K^2 = O(1)$, so the same bound holds
for $Q^{\mathcal{L}}_{n,p}[n']$.

Second, we will likewise establish a coupling lemma for $Y(Q^{\mathcal{L}}_{n,p}[m])$.

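Lemma 4.7. For fixed $K \in \mathbb{N}$ there exists $C > 0$ such that the following holds.
Given integer $n \ge 10$, let $\beta = \beta(n) = C(\ln\ln n)^{-1/4}$. Then uniformly over symmetric
$\mathcal{L} \in \mathcal{M}^n(K)$ and $c\ln n/n \le p \le 1/2$, there exists a coupling of $(Y(Q^{\mathcal{L}}_{n,p}[m]), n' \le m \le n)$
and a collection $(X_m, n' < m \le n)$ of iid random variables with $P(X_i = 1) = \beta$
and $P(X_i = -1) = 1-\beta$, such that with probability $1 - O(n^{-\gamma/2})$, for all
$n' \le m < n$,

\[ Y(Q^{\mathcal{L}}_{n,p}[m+1]) - Y(Q^{\mathcal{L}}_{n,p}[m]) \le
\begin{cases}
X_{m+1} & \text{if } Y(Q^{\mathcal{L}}_{n,p}[m]) > 0, \\
\max(X_{m+1}, 0) & \text{if } Y(Q^{\mathcal{L}}_{n,p}[m]) = 0.
\end{cases} \]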



Together, these two ingredients yield the second claim of Theorem 2.4 by a reprise
of the arguments following Lemma 4.3. The remainder of the paper is therefore
devoted to proving Lemmas 4.3 and 4.7.

5. Rank increase via iterative exposure 

In this section we focus on understanding when and why the rank increases.
In what follows, fix an $m \times m$ matrix $Q = (q_{i,j})_{1 \le i,j \le m}$. Given vectors
$x = (x_1,\ldots,x_m)$, $y = (y_1,\ldots,y_m)$, we write

\[ \Gamma(Q,x,y) =
\begin{pmatrix}
q_{1,1} & \cdots & q_{1,m} & y_1 \\
\vdots & \ddots & \vdots & \vdots \\
q_{m,1} & \cdots & q_{m,m} & y_m \\
x_1 & \cdots & x_m & 0
\end{pmatrix}. \]

Now fix a matrix $Q$, and vectors $x = (x_1,\ldots,x_m)$, $y = (y_1,\ldots,y_m)$. We remark
that $\mathrm{rank}(\Gamma(Q,x,y)) = \mathrm{rank}(Q) + 2$ if and only if $x$ is linearly independent of the
non-zero rows of $Q$ (i.e. it does not lie in the row-span of $Q$) and $y^T$ is linearly
independent of the non-zero columns of $Q$. (In particular, if $Q$ is symmetric and
$x = y$ then $\mathrm{rank}(\Gamma(Q,x,y)) = \mathrm{rank}(Q) + 2$ if and only if $x$ lies outside the row-span
of $Q$.) Note that for this to occur $Q$ cannot have full rank.
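
The rank criterion just stated can be illustrated on a small example. The sketch below is only an illustration under stated assumptions: the helper names `rank` and `bordered`, the matrix `Q` and the vectors `inside` and `outside` are hypothetical choices, and `bordered` assumes the layout of $\Gamma(Q,x,y)$ with $y$ as the extra column, $x$ as the extra row and a zero in the corner.

```python
from fractions import Fraction

def rank(M):
    """Rank over the rationals via Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def bordered(Q, x, y):
    """Gamma(Q, x, y): append y as a column, x as a row, and 0 in the corner."""
    return [list(row) + [yi] for row, yi in zip(Q, y)] + [list(x) + [0]]

Q = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # rank 2, with one zero row and column
inside = [1, 1, 0]     # lies in the row-span (and column-span) of Q
outside = [0, 0, 1]    # lies outside both spans

print(rank(bordered(Q, outside, outside)) - rank(Q))  # increase by 2
print(rank(bordered(Q, inside, outside)) - rank(Q))   # increase by 1 only
```

In the second call $x$ lies in the row-span of $Q$, so the rank rises by one rather than two, in line with the remark.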

We prove Lemmas 4.3 and 4.7 as follows. First, we describe structural properties
of 0-1 matrices such that for any matrix $Q = (q_{ij})_{1 \le i,j \le m}$ satisfying such
properties, for suitable random vectors $x$ and $y$, with high probability
$\mathrm{rank}(\Gamma(Q,x,y)) = \min(\mathrm{rank}(Q) + 2, m + 1)$. We then establish that with high probability, the matrices
$(Q^{\mathcal{L}}_{n,p}[m], n' \le m \le n)$ and $(R^{\mathcal{L}}_{n,p}[m], n' \le m \le n)$ all have the requisite properties.

More precisely, for fixed $Q$ and vectors $x, y$, we will see that the identity
$\mathrm{rank}(\Gamma(Q,x,y)) = \min(\mathrm{rank}(Q) + 2, m + 1)$ fails if and only if a suitable linear,
bilinear or quadratic form in $x$ and $y$, with coefficients determined by the matrix $Q$,
vanishes; we elaborate on this very shortly. When $x$ and $y$ are Bernoulli random
vectors, this leads us to evaluate the probability that a particular random sum is
equal to zero. To bound such probabilities, we use Littlewood-Offord bounds proved
in [9], [10], which we now state.



Proposition 5.1. Let $x_1,\ldots,x_k,y_1,\ldots,y_k$ be independent Bernoulli($p$) random
variables.

(a) Fix $a_1,\ldots,a_k \in \mathbb{R} \setminus \{0\}$. Then uniformly over $0 < p \le 1/2$,

\[ \sup_{r \in \mathbb{R}} P\Big(\sum_{i=1}^{k} a_i x_i = r\Big) = O((kp)^{-1/2}). \]

(b) Fix $l \ge 1$ and $(a_{ij}, 1 \le i,j \le k)$ such that there are at least $l$ indices $j$ for
which $|\{i : a_{ij} \ne 0\}| \ge l$. Then uniformly over $0 < p \le 1/2$,

\[ \sup_{r \in \mathbb{R}} P\Big(\sum_{i,j=1}^{k} a_{ij} x_i y_j = r\Big) = O((lp)^{-1/2}). \]

(c) With $l \ge 1$ and $(a_{ij}, 1 \le i,j \le k)$ as in (b), if also $a_{ij} = a_{ji}$ for all
$1 \le i,j \le k$, then uniformly over $0 < p \le 1/2$,

\[ \sup_{r \in \mathbb{R}} P\Big(\sum_{i,j=1}^{k} a_{ij} x_i x_j = r\Big) = O((lp)^{-1/4}). \]
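
For small $k$, the quantity bounded in part (a) can be computed exactly. The sketch below is an illustration only: the function name `max_atom` and the test coefficients are assumptions made for this example. It builds the distribution of $\sum_i a_i x_i$ by convolution and returns its largest atom, which can then be compared with $(kp)^{-1/2}$.

```python
from fractions import Fraction

def max_atom(coeffs, p):
    """Return sup_r P(sum_i a_i x_i = r) for independent Bernoulli(p) x_i."""
    dist = {0: Fraction(1)}
    for a in coeffs:
        nxt = {}
        for value, pr in dist.items():
            nxt[value] = nxt.get(value, 0) + pr * (1 - p)        # x_i = 0
            nxt[value + a] = nxt.get(value + a, 0) + pr * p      # x_i = 1
        dist = nxt
    return max(dist.values())

p = Fraction(1, 2)
for k in (4, 16, 64):
    atom = max_atom([1] * k, p)
    # The rescaled atom should stay bounded as k grows.
    print(k, float(atom), float(atom) * (k * float(p)) ** 0.5)
```

Taking all coefficients equal is the extremal case for this kind of bound, which is why it is used in the loop.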

The matrix structural properties we require are precisely those that allow us to 
apply the bounds of Proposition 5.1. For this, the following definitions are germane. 

Definition 5.2. Fix a matrix $Q = (q_{ij})_{1 \le i,j \le m}$.

• Given $S \subset [m]$, we say that $j \in [m]$ is an $S$-selector (for $Q$) if
$|\{i \in S : q_{i,j} \ne 0\}| = 1$.

• Given $2 \le b \le m$, we say $Q$ is $b$-blocked if any set $S \subset [m]$ with
$S \cap Z^{row}(Q) = \emptyset$ and $2 \le |S| \le b$ has at least two $S$-selectors $j, l \in [m]$.

The final condition in the definition says that in the sub-matrix formed by only
looking at the rows in $S$, there are at least two columns containing exactly one
non-zero entry. We call $j$ an $S$-selector as we think of $j$ as "selecting" the unique
row $i$ with $q_{ij} \ne 0$. We remark that if a matrix $Q$ is $b$-blocked then any set $S$ of
non-zero rows of $Q$ containing a linear dependency must have size at least $b + 1$.
More strongly, this is true even after deleting any single column of $Q$.
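
Definition 5.2 can be tested by brute force on small matrices. The sketch below is illustrative and rests on two assumptions: indices are 0-based, and $Z^{row}(Q)$ is taken to be the set of all-zero rows of $Q$. The function names `selectors` and `is_b_blocked` and both example matrices are hypothetical.

```python
from itertools import combinations

def selectors(Q, S):
    """Columns j with exactly one non-zero entry among the rows in S."""
    m = len(Q[0])
    return [j for j in range(m) if sum(1 for i in S if Q[i][j] != 0) == 1]

def is_b_blocked(Q, b):
    """Brute-force check: every set S of non-zero rows with 2 <= |S| <= b
    must have at least two S-selectors."""
    nonzero_rows = [i for i, row in enumerate(Q) if any(row)]
    for size in range(2, b + 1):
        for S in combinations(nonzero_rows, size):
            if len(selectors(Q, S)) < 2:
                return False
    return True

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
repeated = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]   # rows 0 and 1 coincide

print(is_b_blocked(identity, 3))  # every pair or triple of rows has selectors
print(is_b_blocked(repeated, 2))  # rows {0, 1} have no selector at all
```

The second matrix shows the link to linear dependence noted in the remark: two identical rows form a dependent set of size 2, so the matrix cannot be 2-blocked.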

Definition 5.3. We say that $Q$ is $b$-dense if

\[ |\{i \in [m] : \text{row } i \text{ of } Q \text{ has} > 1 \text{ non-zero entry}\}| \ge b. \]

We then have the following bounds, which are key to the proofs of Lemmas 4.3 
and 4.7. 

Proposition 5.4. Fix integers $m \ge b \ge 1$ and a $b$-blocked $m \times m$ matrix $Q$ with
$\mathrm{rank}(Q) < m - z^{row}(Q)$. Then uniformly over $0 < p \le 1/2$, if $y = (y_1,\ldots,y_m)$
has iid Bernoulli($p$) entries then $y^T$ is independent of the non-zero columns of $Q$
with probability at least $1 - O((bp)^{-1/2})$.

Proof. Let $k = \mathrm{rank}(Q)$, and note that if $\mathrm{rank}(Q) < m - z^{row}(Q)$ then $1 \le k < m$.
Write $r_1,\ldots,r_m$ for the rows of $Q$. By relabelling, we may assume that $r_1,\ldots,r_k$
are linearly independent and that $r_{k+1}$ is non-zero. It follows that there exist unique
coefficients $a_1,\ldots,a_k$ for which $r_{k+1} = \sum_{i=1}^{k} a_i r_i$. Then $\{r_i : a_i \ne 0\} \cup \{r_{k+1}\}$
forms a set of linearly dependent non-zero rows, and so has size at least $b + 1$ by
the observation just after Definition 5.2.
Let $\tilde{Q}$ be the matrix obtained from $Q$ by adding $y^T$ as column $m + 1$. If $y^T$ lies
in the column-span of $Q$ then $\mathrm{rank}(\tilde{Q}) = \mathrm{rank}(Q)$, so necessarily

\[ y_{k+1} = \sum_{i=1}^{k} a_i y_i. \]

Since $|\{r_i : a_i \ne 0\}| \ge b$, by Proposition 5.1 (a) we have

\[ P\Big(y_{k+1} - \sum_{i=1}^{k} a_i y_i = 0\Big) = O((bp)^{-1/2}). \]

Therefore, the vector $y^T$ is independent of the non-zero columns of $Q$ with
probability at least $1 - O((bp)^{-1/2})$. □

Proposition 5.5. Fix integers $m \ge b \ge 1$ and a $b$-blocked, $b$-dense, $m \times m$ matrix
$Q$ with $\mathrm{rank}(Q) = m$. Then uniformly over $0 < p \le 1/2$, if $x = (x_1,\ldots,x_m)$ and
$y = (y_1,\ldots,y_m)$ have iid Bernoulli($p$) entries then
$P(\mathrm{rank}(\Gamma(Q,x,y)) = m) = O((bp)^{-1/2})$.

Proof. Let $A = (a_{ij})_{1 \le i,j \le m}$ be the cofactor matrix of $Q$; that is,

\[ a_{ij} = (-1)^{i+j+1} \det(Q^{(i,j)}) \]

where $Q^{(i,j)}$ is the $(i,j)$ minor of $Q$. A double-cofactor expansion of the determinant
of $Q' = \Gamma(Q,x,y)$ yields

\[ \det(Q') = \sum_{i,j=1}^{m} a_{ij} x_i y_j. \]

Note that $a_{ij} = 0$ when $Q^{(i,j)}$ is singular, so we want to lower-bound the number of
non-singular minors $Q^{(i,j)}$. To do so, fix $j \in [m]$ and write $Q^{(\cdot,j)}$ for the $m \times (m-1)$
matrix obtained by deleting the $j$-th column of $Q$. Since $Q$ has full rank it has no
zero rows. We claim that if $Q^{(\cdot,j)}$ also has no zero rows then $|\{i \in [m] : a_{ij} \ne 0\}| \ge b$.
To see this, note that $Q^{(\cdot,j)}$ has rank $m-1$ and so since there are no zero rows,
there exists (up to scaling factors) a unique vanishing linear combination

\[ \sum_{i=1}^{m} c_i r_i = 0, \]

where $r_i$ is the $i$-th row of $Q^{(\cdot,j)}$. Now, $Q^{(i,j)}$ is invertible (and thus $a_{ij} \ne 0$)
if and only if $c_i \ne 0$. But the rows $\{r_i : c_i \ne 0\}$ are linearly dependent, and
by the remark just after Definition 5.2, since $Q$ is $b$-blocked we therefore have
$|\{i \in [m] : c_i \ne 0\}| \ge b$.

Finally, since $Q$ is $b$-dense, there are at most $b$ rows of $Q$ with exactly one
non-zero entry. Thus, $|\{j \in [m] : Q^{(\cdot,j)} \text{ has no zero rows}\}| \ge b$, and for any such $j$ we
have $|\{i \in [m] : a_{ij} \ne 0\}| \ge b$ by the preceding paragraph. By Proposition 5.1 (b)
it follows that, uniformly in $0 < p \le 1/2$, we have $P(\det(Q') = 0) \le O((bp)^{-1/2})$
as claimed. □
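
The double-cofactor expansion used in the proof can be verified directly on a small integer matrix. The sketch below is an illustration, not the authors' computation. It assumes the layout of $\Gamma(Q,x,y)$ with $y$ as last column and $x$ as last row, and it takes the coefficient of $x_i y_j$ to be $(-1)^{i+j+1}$ times the determinant of $Q$ with row $j$ and column $i$ removed. That index convention is an assumption chosen so that the identity holds for this layout, and the paper's $a_{ij}$ may be indexed in the transposed way. All function names and the example data are hypothetical.

```python
def det(M):
    """Integer determinant by Laplace expansion along the first row."""
    if not M:
        return 1  # determinant of the empty matrix
    total = 0
    for j, entry in enumerate(M[0]):
        if entry:
            sub = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * entry * det(sub)
    return total

def minor(Q, r, c):
    """Q with row r and column c deleted (0-based indices)."""
    return [row[:c] + row[c + 1:] for k, row in enumerate(Q) if k != r]

def bordered(Q, x, y):
    """Gamma(Q, x, y): y as an extra column, x as an extra row, corner 0."""
    return [list(row) + [yi] for row, yi in zip(Q, y)] + [list(x) + [0]]

def bilinear_form(Q, x, y):
    """Sum over i, j of (-1)^(i+j+1) det(minor(Q, j, i)) x_i y_j."""
    m = len(Q)
    return sum((-1) ** (i + j + 1) * det(minor(Q, j, i)) * x[i] * y[j]
               for i in range(m) for j in range(m))

Q = [[1, 1, 0], [0, 1, 1], [1, 0, 0]]
x, y = [1, 0, 1], [0, 1, 1]
print(det(bordered(Q, x, y)) == bilinear_form(Q, x, y))
```

Since the determinant of the bordered matrix is a bilinear form in $(x, y)$, the event that the rank fails to reach $m + 1$ is exactly the event that this form vanishes, which is what Proposition 5.1 (b) controls.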

The following proposition is an analogue of Proposition 5.5 which we use in
analyzing the symmetric Bernoulli process.
Proposition 5.6. Fix integers $m \ge b \ge 1$ and a $b$-blocked, $b$-dense, $m \times m$
symmetric matrix $Q$ with $\mathrm{rank}(Q) = m$. Then uniformly over $0 < p \le 1/2$,
if $x = (x_1,\ldots,x_m)$ has iid Bernoulli($p$) entries then
$P(\mathrm{rank}(\Gamma(Q,x,x)) = m) = O((bp)^{-1/4})$.

Proof sketch. The proof is nearly identical to that of Proposition 5.5. However, in
this case the double cofactor expansion of $\det(\Gamma(Q,x,x))$ has the form $\sum_{i,j=1}^{m} a_{ij} x_i x_j$.
Consequently, we conclude by applying part (c), rather than part (b), of
Proposition 5.1. We omit the details. □

We will apply Propositions 5.4 and 5.5 via the following lemma. 

Lemma 5.7. Fix integers $m \ge b \ge 1$ and an $m \times m$ matrix $Q$ for which both
$Q$ and $Q^T$ are $b$-blocked and $b$-dense. Then uniformly over $0 < p \le 1/2$, if
$x = (x_1,\ldots,x_m)$ and $y = (y_1,\ldots,y_m)$ have iid Bernoulli($p$) entries then

\[ P\big(\mathrm{rank}(\Gamma(Q,x,y)) < \mathrm{rank}(Q) + 1 + \mathbf{1}_{[Y(Q)>0]}\big) = O((bp)^{-1/2}). \]

Proof. In what follows we write $Q' = \Gamma(Q,x,y)$. Recall that if $x$ and $y$ lie outside
the row-span and column-span of $Q$, respectively, then $\mathrm{rank}(Q') = \mathrm{rank}(Q) + 2$.
Note also that $Y(Q) = Y(Q^T)$ always holds.

If $Y(Q) > 0$ then by the definition of $Y(Q)$ we have $\mathrm{rank}(Q) < m - z^{row}(Q)$ and

\[ \mathrm{rank}(Q^T) = \mathrm{rank}(Q) < m - z^{col}(Q) = m - z^{row}(Q^T). \]

In this case the lemma follows by applying Proposition 5.4 twice, once to $Q$ and $y$
and once to $Q^T$ and $x$.

We now treat the case $Y(Q) = 0 = Y(Q^T)$. By replacing $Q$ by $Q^T$ if necessary,
we may assume that $Q$ has $s$ non-zero rows and $t$ non-zero columns, for some
$0 \le t \le s \le m$; in particular note that $\mathrm{rank}(Q) = t$. By relabelling the rows and
columns, we may assume that $Q'$ has the form

\[ Q' = \begin{pmatrix} A & 0 & (y')^T \\ 0 & 0 & (y^+)^T \\ x' & x^- & 0 \end{pmatrix} \]

where $A$ is an $s \times t$ matrix with no zero rows or columns, $0$ represents a block of
zeros, and where $x = (x', x^-)$ and $y = (y', y^+)$.

If $t = s$ then $A$ is $b$-blocked and $b$-dense and $\mathrm{rank}(A) = t = \mathrm{rank}(Q)$. Since

\[ \mathrm{rank}(Q') \ge \mathrm{rank}(\Gamma(A,x',y')) \]

and $x', y'$ have iid Bernoulli($p$) entries, in this case the lemma follows by applying
Proposition 5.5 to $A$, $x'$ and $y'$.

Finally, if $t < s$ then $\mathrm{rank}(Q) = t < s = m - z^{row}(Q)$. Proposition 5.4 applied
to $Q$ and $y$ then yields that $y$ lies outside the column-span of $Q$ with probability
$1 - O((bp)^{-1/2})$. If the latter occurs then $\mathrm{rank}(Q') \ge \mathrm{rank}(Q) + 1$; this completes
the proof. □

The analogous result for symmetric matrices is as follows. 

Lemma 5.8. Fix integers $m \ge b \ge 1$ and a $b$-blocked, $b$-dense symmetric $m \times m$
matrix $Q$. Then uniformly over $0 < p \le 1/2$, if $x = (x_1,\ldots,x_m)$ has iid
Bernoulli($p$) entries then

\[ P\big(\mathrm{rank}(\Gamma(Q,x,x)) < \mathrm{rank}(Q) + 1 + \mathbf{1}_{[Y(Q)>0]}\big) = O((bp)^{-1/4}). \]

The proof is practically identical to that of Lemma 5.7, but is slightly easier as
for symmetric matrices we always have $z(Q) = z^{row}(Q) = z^{col}(Q)$. The resulting
bound is weaker as we must use Proposition 5.6 rather than Proposition 5.5. We
omit the details.

To shorten coming formulas, we introduce the following shorthand. 

Definition 5.9. For $n \ge 1$, let $k = k(n,p) = \ln\ln n/(2p)$. We say that a square
matrix $Q$ is $n$-robust if both $Q$ and $Q^T$ are $k$-blocked and $k$-dense.

The following proposition, whose proof is the most technical part of the paper,
says that robustness is very likely to hold throughout the final $n - n'$ steps of the
iterative exposure of minors in $R^{\mathcal{L}}_{n,p}$. In the following proposition, recall from the
start of Section 4 the definition of permissible templates, and also the fact that
$\gamma \in (0, 1/4)$ is a fixed constant depending only on $c$.

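Proposition 5.10. Fix $K \in \mathbb{N}$. For any $p \in (c\ln n/n, 1/2)$ and any permissible
template $\mathcal{L} \in \mathcal{M}^n(K)$, we have

\[ P\big(\forall\, m \in [n', n] : R^{\mathcal{L}}_{n,p}[m] \text{ is } n\text{-robust}\big) = 1 - O(n^{-\gamma}), \]

and if $\mathcal{L}$ is symmetric then

\[ P\big(\forall\, m \in [n', n] : Q^{\mathcal{L}}_{n,p}[m] \text{ is } n\text{-robust}\big) = 1 - O(n^{-\gamma}). \]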



We provide the proof of Proposition 5.10 in Section 6, for now using it to
complete the proofs of Lemma 4.3 and Lemma 4.7 (and so of Theorem 2.4). We
begin by controlling the probability that $z(R^{\mathcal{L}}_{n,p}[m])$ ever decreases by more than
one in a single step of the minor exposure process.

Lemma 5.11. For fixed $K \in \mathbb{N}$, uniformly over permissible $\mathcal{L} \in \mathcal{M}^n(K)$ and
$c\ln n/n \le p \le 1/2$ we have

\[ P\big(\exists\, n' \le m < n : z(R^{\mathcal{L}}_{n,p}[m+1]) < z(R^{\mathcal{L}}_{n,p}[m]) - 1\big) = O(n^{-\gamma/2}). \]

Proof. First, by symmetry this probability is at most twice

\[ P\big(\exists\, n' \le m < n : |Z^{row}(R^{\mathcal{L}}_{n,p}[m])| - |Z^{row}(R^{\mathcal{L}}_{n,p}[m+1])| > 1\big). \]

For $p \ge 20\ln n/n$, with high probability $|Z^{row}(R^{\mathcal{L}}_{n,p}[m])| = 0$ for each $n' \le m \le n$, so
we assume that $c\ln n/n \le p \le 20\ln n/n$. The matrix $R_{n,p}[n']$ is distributed as $R_{n',p}$, and
we have $p \ge \alpha c \ln n'/n' = (1/2 + \gamma)\ln n'/n'$. Since $R^{\mathcal{L}}_{n,p}$ contains at most $K^2$ rows
with deterministic or partially deterministic coordinates, it follows that for $n$ large,

/2-7/2 

._l/5 

" \ f n „\n'-K 



(|Z--(<pK])|>ni/ 
<e--^'\ ' (6) 



'^',11/2-7/2 
We next bound the probability that $z = |Z^{row}(R^{\mathcal{L}}_{n,p}[n'])| \le n^{1/2-\gamma/2}$ and at least
two zero rows disappear in a single step. For fixed $m$, with $n' \le m < n$ we have

P (|Z--«,[m + 1])| < |Z--(i?^,pH)l - 1, ^ < nV2-7/2J 

/l„l/2-7/2jX 

which is at most n^^ ''/^ for n large and p < 20lnn/n. By (6), (7), and a union 
bound, the result follows. D 

We state the symmetric analogue of Lemma 5.11 for later use. 
Lemma 5.12. Under the conditions of Lemma 5.11, if $\mathcal{L}$ is symmetric then

\[ P\big(\exists\, n' \le m < n : z(Q^{\mathcal{L}}_{n,p}[m]) - z(Q^{\mathcal{L}}_{n,p}[m+1]) > 1\big) \le O(n^{-\gamma/2}). \]
Proof. In this case, the desired probability is equal to the probability that 

for some n' < m < n. The proof then follows as that of Lemma 5.11. D 

Proof of Lemma 4.3. For n' < m < n let J-n.m — fd-RupM ■ n' < i < to}) and 
let En.m = {Vn' < i < m : R^ [i] is n-robust}. Note that E'n,™ G J^n.m for all 
n' < m < n. Also, E^^j C En^i for all n' < i < j < n. 

Now for n' < TO < n - 1 let C^ = {rank(i?^p[TO + 1]) = rank(i?^p[TO]) + 1 + 
1\^Y(R'^ [m])>o]}- Then since R^ [m] is J>i ^-measurable and En^m G ^n,rm we have 

■'^^ '^^TTi I ^n,7n) 

>P (C„i I J^n.m) 1[S„.„] 

>inf{P (Cm|i?^p[TO] = Q) : Q is n-robust} • Ij^^^j. 

>(l-0((M-'/'))l[i=;„,„]- 

the last inequality by Lemma 5.7 and since 1[e„ ,„] 1^ 1 [£„„]• Therefore, there 
exists K > such that for all n' < m < n, writing /3 = A'-\/3/(2c)(lnlnn)^/^, we 
have 

P (rank(i?^p[TO + 1]) < R^jAm] + 1 + 1[y„>o] | -F„,™) 

<C(M"'/'l[i?„,„]+l[£S,J 

</3 + l[£;c_J. 

For n' < m < n let /„i = l[rank(fl^^[„+i])<j^£^[„]+i+ij^^^j,j]. It follows from the 
preceding bound that we may couple (/m,n' < to, < n) with a family {B^mn' < 
TO. < n) of independent Bernoulli(/3) random variables such that for all n' < m < n, 

Im < Bjn + (1 — -^m)l[BS „] • 

Finally, for n' < m < n let X^ = 25^ - 1, so that P (X^ = 1) = /3 = 1 - 
P(X„ = -1). By the identity (5) for y(i?„,p[TO+l]), iiY{R^^p[m+l])-YiR,,Jm]) < 
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-max{X„i+i,—Y{Rn,p[m])) then either /,„ > Bm (in which case E^^ occurs) or 
{z{R^p[m + !])< ziR^.p[m]) - 2}. It follows that 

P (Vn' <m<n : Y{Rip[m + 1]) - Y{Rip[m]) < max(X,„+i, -Y{Rn,p[m]))) 

>1 - P {E'^,,) - P (3n' <m<n: z(i?f,p[m + l])/ez(i?f,pH) - 2) 
=1 - 0(n-^/2) ^ 

the final bound by Lemma 5.11 and Proposition 5.10. This completes the proof. D 

The proof of Lemma 4.7 is practically identical, using the second rather than 
the first bound of Proposition 5.10 and using Lemmas 5.12 and 5.8 rather than 
Lemmas 5.11 and 5.7, respectively. We omit the details. 



6. Structural properties that guarantee rank increase 

In this section we prove Proposition 5.10. For the remainder of the paper, fix
$K \in \mathbb{N}$, $n \in \mathbb{N}$ large, let $k = k(n,p) = \ln\ln n/(2p)$, and fix a permissible template
$\mathcal{L} = (\mathcal{L}^+, \mathcal{L}^-) = ((S_i^+)_{i \in I^+}, (S_i^-)_{i \in I^-}) \in \mathcal{M}^n(K)$. For $i \in [n]$ write $R_i = R^{\mathcal{L}}_{n,p}[i]$,
$H_i = H^{\mathcal{L}}_{n,p}[i]$. Also for the remainder of the paper, let $T = [n] \setminus (I^- \cup \bigcup_{i \in I^+} S_i^+)$ and
let $U = \bigcup_{i \in I^+} S_i^+$. These definitions are illustrated in Figure 1. Finally, recall that
$c \in (1/2, 1)$ and $\alpha$ are fixed so that $\alpha c \in (1/2, 3/4)$, that $\gamma = \alpha c - 1/2$ and that
n' — \an\ . 



[Figure 1. The deterministic and random structure of the matrix $R_n$. Block labels in the figure: "$\ge 1$ non-zero entry per row", "$\ge 1$ non-zero entry per column", and "iid Bernoulli($p$) entries".]



Before proceeding to details, we pause to describe the broad strokes of our proof. 
Our arguments are more straightforwardly described in the language of graphs 
rather than matrices, so we shall begin to switch to the language of graphs. 

We separately bound the probability that for some $n' \le m \le n$, $R_m$ either
is not $k(n,p)$-blocked or is not $k(n,p)$-dense. Bounding the latter probability is
straightforward: this is essentially the event that there are too many vertices with low
out-degree in $H_{n'}$, and $p \ge c\ln n/n$ is large enough that such vertices are rare.

Bounding the probability that $R_m$ is not $k(n,p)$-blocked for some $m$ is more
involved, and we pause to develop some intuition. Recall that for $R_m$ to be
$k(n,p)$-blocked, we need that for any $S \subset [m]$ with $2 \le |S| \le k(n,p)$, there are at least two
$S$-selectors in $R_m$. In the language of graphs, an $S$-selector is a vertex $v$ such that
$v$ has exactly one in-neighbour in $S$. For $n' \le m \le n$ and $i \in S$, conditional on
$\{N^+_{H_m}(j) : j \in S, j \ne i\}$, the larger the degree of $i$ the more likely it is that $N^+_{H_m}(i)$
contains a vertex $v$ lying outside $\bigcup_{j \in S \setminus \{i\}} N^+_{H_m}(j)$, and such a vertex $v$ is an
$S$-selector. For this reason, low-degree vertices pose a potential threat to the existence
of $S$-selectors. (Indeed, low-degree vertices in a sense pose the greatest difficulty
for the proof; it is precisely out-degree one vertices that cause Theorems 2.2 and
2.4 to fail for $c < 1/2$.) We neutralize this threat by showing that
with high probability all (sufficiently) low degree vertices have pairwise disjoint
out-neighbourhoods, and so sets $S$ of exclusively low-degree vertices have many
$S$-selectors.

We now turn to details. We begin by bounding the probability that some $R_m$ is
not $k(n,p)$-dense.

Lemma 6.1. Uniformly over clnn/n < p < 1/2 we have 

P {y n' < m < n : Rm is k{n,p)-dense) = 1 — Oiji' ) , 

and if C is symmetric then 

P (y n' < m < n : (3„ „[?7i] is k{n,p)-densej = 1 — 0{n^ ) 

Proof. For $n' \le m \le n$, let

\[ A_m = \{i \in [n'] \setminus I^+ : |([m] \cap N^+_{H_m}(i)) \setminus I^-| > 1\}. \]

Observe that $A_{n'} \subset A_m$ for all $n' \le m \le n$. On the other hand, if
$|A_m| \ge k(n,p)$ then $R_m$ is $k(n,p)$-dense. It follows that

\[ P\big(\exists\, n' \le m \le n : R^{\mathcal{L}}_{n,p}[m] \text{ is not } k(n,p)\text{-dense}\big) \le P(|A_{n'}| < k(n,p)). \]

For $i \in [n'] \setminus I^+$ let $B_i$ be the event that $i \notin A_{n'}$. The event $B_i$ is monotone
decreasing, so in bounding its probability from above we may assume that $p$ is
equal to $p_{\min} = c\ln n/n$. Write $s = |I^-| \le K$. For $i \in [n'] \setminus I^+$ we then have

= P (Binomial(n — s,Pmin) < 1) 

1 - Pmin / 

< 2(1 + ac In n)n""'=, 

soE[|[n']\yl„.|] < n'-2(l+aclnn)n-"'= < 2ni-"'=(l+ac Inn). A similar calculation 
shows that E [|[n'] \ An-I"*] < (2ni-"^(l+a In n))'*16(l-Fac Inn) and so by Markov's 
inequality 

(n' — k{n,p))^ (n' — k{n,p)p n^ 
the last inequality holding for $n$ large since $1 - \alpha c < 1/2$ and $n' - k(n,p) = \Omega(n)$.
The lemma follows by a union bound. An identical proof establishes the stated
bound for $Q^{\mathcal{L}}_{n,p}$ in the case that $\mathcal{L}$ is symmetric. □

Next, we address the probability that the minors $R_m$, $n' \le m \le n$, are not
$k(n,p)$-blocked. The next definition allows us to avoid the (partially) deterministic
neighbourhoods of $H_m$.

Definition 6.2. Fix $b \ge 2$. A matrix $Q = (q_{ij})_{1 \le i,j \le m}$ is $(b, \mathcal{L})$-blocked if any set
$S \subset [m]$ with $2 \le |S| \le b$ satisfies the following conditions.

• If $S \cap (Z^{row}(Q) \cup I^+) = \emptyset$ then there exist distinct $j, l \in [m] \setminus (I^- \cup \bigcup_{i \in I^+} S_i^+)$
that are $S$-selectors for $Q$.

• If $S \cap (Z^{col}(Q) \cup I^-) = \emptyset$ then there exist distinct $j, l \in [m] \setminus (I^+ \cup \bigcup_{i \in I^-} S_i^-)$
that are $S$-selectors for $Q^T$.

(In the last bullet of the preceding definition, it may be useful to note that if $M$
is the adjacency matrix of a directed graph then $M^T$ is the adjacency matrix of the
graph with all edge orientations reversed.) We then have the following proposition.

Proposition 6.3. Uniformly in $c\ln n/n \le p \le 1/2$,

\[ P\big(\forall\, n' \le m \le n : R_m \text{ is } (k, \mathcal{L})\text{-blocked}\big) = 1 - O(n^{-\gamma}), \]

and if $\mathcal{L}$ is symmetric then also

\[ P\big(\forall\, n' \le m \le n : Q^{\mathcal{L}}_{n,p}[m] \text{ is } (k, \mathcal{L})\text{-blocked}\big) = 1 - O(n^{-\gamma}). \]

The proof of Proposition 5.10 assuming Proposition 6.3 is straightforward and
largely consists of showing that if $R_m$ is $(k, \mathcal{L})$-blocked then most sets
$S \subset [m] \setminus Z^{row}(R_m)$ deterministically have at least two $S$-selectors (even if $S \cap I^+ \ne \emptyset$). An
easy probability bound then polishes off the proof.



Proof of Proposition 5.10. Let

\[ \mathcal{C}_1^m = \{E \subset [m] \setminus Z^{row}(R_m) : 2 \le |E| \le k(n,p),\ E \subset I^+\}, \]
\[ \mathcal{C}_2^m = \{E \subset [m] \setminus Z^{row}(R_m) : 2 \le |E| \le k(n,p),\ |E \setminus I^+| \ge 2\}, \]
\[ \mathcal{C}_3^m = \{E \subset [m] \setminus Z^{row}(R_m) : 2 \le |E| \le k(n,p),\ |E \setminus I^+| = 1\}, \]

and for $k = 1,2,3$ let $A_k^m = \{\forall E \in \mathcal{C}_k^m : \text{there are two } E\text{-selectors in } T \cap [m]\}$.
Note that if $R_m$ is $(k,\mathcal{L})$-blocked for all $n' \le m \le n$ but is not $k$-blocked for some
$n' \le m \le n$, then one of the events $A_k^m$, $k \in \{1,2,3\}$, $n' \le m \le n$ must fail to
occur. We consider the events $A_k^m$, $k = 1, 2, 3$ in turn.

First, note that since the sets $(S_i^+, i \in I^+)$ are disjoint and non-empty, for
every $E \in \mathcal{C}_1^m$, for all $i \in E$, every $j \in S_i^+$ is an $E$-selector. Thus, $A_1^m$ holds
deterministically.

Second, if $R_m$ is $(k,\mathcal{L})$-blocked then for any $E \in \mathcal{C}_2^m$ there are $(E \setminus I^+)$-selectors
$\ell_1, \ell_2 \in T \cap [m]$. Since $\bigcup_{i \in I^+} S_i^+$ is disjoint from $T$, it follows that $\ell_1, \ell_2$ are not
in the out-neighbourhoods of any vertex in $I^+$. Therefore $\ell_1$ and $\ell_2$ are also
$E$-selectors, so if $R_m$ is $(k,\mathcal{L})$-blocked then $A_2^m$ holds. It follows by Proposition 6.3
that $P(\bigcap_{m=n'}^{n} A_2^m) = 1 - O(n^{-\gamma})$.

Third, fix $E \in \mathcal{C}_3^m$ and write $E \setminus I^+ = \{v\}$. Note that $v$ must have at least
one neighbour in $H_m$ as $E \cap Z^{row}(R_m) = \emptyset$. If $|N^+_{H_m}(v) \cap T| \ge 2$ then any two
$\ell_1, \ell_2 \in N^+_{H_m}(v) \cap T$ are $E$-selectors since $N^+_{H_m}(I^+) \cap T = \emptyset$. Also, if $|N^+_{H_m}(v) \cap T| \ge 1$
and $N^+_{H_m}(v) \cap N^+_{H_m}(I^+) = \emptyset$ then choose $\ell_1 \in S_i^+ \subset N^+_{H_m}(I^+)$ for some $i \in E \cap I^+$,
and $\ell_2 \in N^+_{H_m}(v) \cap T$; both $\ell_1$ and $\ell_2$ are again $E$-selectors.
It follows that

\[ P\Big(\bigcup_{m=n'}^{n} (A_3^m)^c\Big) \le P\big(\exists\, v \in [n] \setminus I^+ : |N^+_{H_n}(v) \cap U| \ge 1,\ |N^+_{H_n}(v) \cap T \cap [n']| \le 1\big). \quad (8) \]

For fixed $v \in [n] \setminus I^+$, since $|[n'] \cap T| \ge n' - K(K+1)$, we have

\[ P\big(|N^+_{H_n}(v) \cap T \cap [n']| \le 1\big) \le P\big(\mathrm{Binomial}(n' - K(K+1) - 1, p) \le 1\big) \le (1 + n'p)(1-p)^{n'-(K+1)^2}. \]

Furthermore,

\[ P\big(|N^+_{H_n}(v) \cap U| \ge 1\big) \le K^2 p. \]

The events in the two preceding probabilities are independent since $U$ and $T$ are
disjoint. By a union bound over $v \in [n] \setminus I^+$, it follows that the probability in (8)
is bounded by

\[ n \cdot (1 + n'p)(1-p)^{n'-(K+1)^2} \cdot K^2 p. \]

If $p \ge 4\ln n/(\alpha n)$ then $(1-p)^{n'-(K+1)^2} \le e^{-p(n'-(K+1)^2)} \le n^{-3}$ for $n$ large,
proving the result in this case. If $p < 4\ln n/(\alpha n)$ then $np \le (4/\alpha)\ln n$, and since
$p \ge c\ln n/n$ and $n' \ge \alpha n$, this expression is bounded by

\[ K^2 (1 + (4/\alpha)\ln n)^2 (1-p)^{-(K+1)^2} e^{-pn'} = O(\ln^2 n / n^{\alpha c}). \]

Since $\alpha c = \gamma + 1/2$ we conclude that

\[ P\big(\exists\, v \in [n] \setminus I^+ : |N^+_{H_n}(v) \cap U| \ge 1,\ |N^+_{H_n}(v) \cap T \cap [n']| \le 1\big) = O(n^{-\gamma}). \quad (9) \]

Combining this bound with our bound on $P(\bigcap_{m=n'}^{n} A_2^m)$ and the deterministic
observation about the events $A_1^m$, it follows that

\[ P\big(\forall\, n' \le m \le n : R_m \text{ is } k\text{-blocked}\big) = 1 - O(n^{-\gamma}). \]

A symmetric argument for $R_m^T$ shows that

\[ P\big(\forall\, n' \le m \le n : R_m^T \text{ is } k\text{-blocked}\big) = 1 - O(n^{-\gamma}), \]

which completes the proof of the first assertion of Proposition 5.10. The second
assertion of the proposition follows by a practically identical argument using the
second bound of Proposition 6.3 (the only difference is that in this case there is no
need to conclude by "a symmetric argument" as $Q^{\mathcal{L}}_{n,p}[m] = (Q^{\mathcal{L}}_{n,p}[m])^T$). □

The remainder of the paper is devoted to the proof of Proposition 6.3. 

6.1. Proof of Proposition 6.3. First, suppose $\mathcal{L}$ is symmetric and write
$D = I^- \cup \bigcup_{i \in I^+} S_i^+ = I^+ \cup \bigcup_{i \in I^-} S_i^-$. Then for $n' \le m \le n$, $Q^{\mathcal{L}}_{n,p}[m]$ is $(k,\mathcal{L})$-blocked
if and only if $Q_{n,p}[[m] \setminus D]$ is $k$-blocked. It follows that

\[ P\big(\forall\, n' \le m \le n : Q^{\mathcal{L}}_{n,p}[m] \text{ is } (k,\mathcal{L})\text{-blocked}\big)
= P\big(\forall\, n' \le m \le n : Q_{n,p}[[m] \setminus D] \text{ is } k\text{-blocked}\big)
= 1 - O(n^{-\gamma}), \]

the last bound by Lemma 2.10 of [7] (note that in that paper, the first property in
the definition of "good" is equivalent to our property "$k$-blocked"). This establishes
the second assertion of the proposition, so we may now focus exclusively on the first.
We say a set $E \subset [m] \setminus I^+$ is blocked if $T \cap [m]$ contains two distinct $E$-selectors.
Given $n' \le m \le n$ and $2 \le s \le k(n,p)$, let

\[ D_{m,s} = \{\exists\, E \subset [m] \setminus (Z^{row}(R_m) \cup I^+) : |E| = s,\ E \text{ is not blocked}\}. \]

To prove Proposition 6.3 it suffices to show that

\[ P\Big(\bigcup_{m=n'}^{n} \bigcup_{s=2}^{k(n,p)} D_{m,s}\Big) = O(n^{-\gamma}). \quad (10) \]

Since $n' = \Theta(n)$, our arguments are mostly insensitive to the value of $m$. The value
of $s$ plays a more significant role, and we tailor our arguments for different values.
The region where $1/(p\ln^{1/2} n) \le s \le k(n,p)$ is rather straightforward; fix such $s$
and $n' \le m \le n$, and fix $E \subset [m] \setminus (Z^{row}(R_m) \cup I^+)$ with $|E| = s$. Then for fixed
$j \in [m] \setminus (I^- \cup \bigcup_{i \in I^+} S_i^+)$,

P (j is an £;-selector) = P {\Nj^Jj) D E\ = l) = sp{l - p)"-^ > spe-'P . 

These events are independent for distinct j e [m] \ (/^ U Uie/+ ^t)^ ^^^d it follows 
that 

P (E is not blocked) < P (Bin(TO - K{K + 1), spe'^P) < l) < n(l - spe-'P)"/^ , 

the last inequality since m—K{K+l) > n/2 for n large. Since (") < exp(s ln{ne/s)), 
it follows by a union bound over E C [m] \ /+ that 

P iD„,^s) < exp(sln(ne/s) + lnn)(l - 5^6-"^)"/^ 

< exp(s ln(ne/s) + Inn — nspe^'^P /2) 
= exp(lnn + s(ln(ne/s) — npe~*P/2)) 

< exp(lnn + s(ln(npeln ' n)—np/h\' n)) , (11) 

the last bound following since (pin ' n)~^ < s < ln\nn/{2p) = k{n,p). Using that 
x/y > 2\n{xy) when x > y'^/2 > 2e^, since np > c\nn > (In ' n)^/2 it follows 
that np/ In ' n> 21n(npeln ' n) for n sufficiently large, and so (11) yields 

P {Dm,s) < exp(lnn — snp/(21n"'^" n)) < exp(lnn — n/(21nn)) , 

where in the final inequality we use that s > (pln^" n)~^ . A union bound and the 
fact that {n-n' + l)(lnlnri/(2p) - (pln^^^ n)-^) < n^ then yields 

y y I?„.,J <exp(31nn-n/(21nn)). (12) 

yn'<m<n (plni/2 „)-i<s<ln In n/(2p) / 

This takes care of the range (pin ' n)^^ < s < k{n,p), which for small p is the 
lion's share of the values of s under consideration (though the smaller values of s 
require slightly more work). 

For $2 \le s < 1/(p\ln^{1/2} n)$, our approach to bounding $P(D_{m,s})$ is based on the
pigeonhole principle and a simple stochastic relation, and we now explain both. For
convenience set $\tilde{n} = n' - K(K+1)$, and note that $|T \cap [m]| \ge \tilde{n}$. Note that for a set
$E \subset [m] \setminus (Z^{row}(R_m) \cup I^+)$, if $E$ is not blocked then at most one vertex in $N^+_{H_m}(E) \cap T$
has fewer than two in-neighbours in $E$, so $\sum_{i \in E} |N^+_{H_m}(i) \cap T| \ge 2|N^+_{H_m}(E) \cap T| - 1$.
It follows that

\[ P(E \text{ is not blocked}) \le P\Big(|N^+_{H_m}(E) \cap T| \le \frac{\sum_{i \in E} |N^+_{H_m}(i) \cap T| + 1}{2}\Big). \quad (13) \]
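
The pigeonhole step behind (13) is a pure counting statement and can be checked on explicit neighbourhoods. The sketch below is illustrative only: the function name `out_neighbourhood_stats` and the example sets are assumptions made for this example. Each vertex of $N^+(E) \cap T$ with at least two in-neighbours in $E$ is counted at least twice in the sum of out-degrees, so the sum is at least twice the number of distinct out-neighbours minus the number of vertices with fewer than two in-neighbours.

```python
def out_neighbourhood_stats(neighbourhoods, T):
    """neighbourhoods: dict mapping each vertex i of E to the set N^+(i).

    Returns (total, distinct, low), where
      total    = sum over i in E of |N^+(i) & T|,
      distinct = |N^+(E) & T|,
      low      = number of vertices of N^+(E) & T with fewer than two
                 in-neighbours in E (these are the E-selectors lying in T).
    """
    counts = {}
    for nbrs in neighbourhoods.values():
        for v in nbrs & T:
            counts[v] = counts.get(v, 0) + 1
    total = sum(counts.values())
    distinct = len(counts)
    low = sum(1 for c in counts.values() if c < 2)
    return total, distinct, low

# Hypothetical example: E = {0, 1}, T = {5, 6, 7, 8}.
example = {0: {5, 6}, 1: {5, 6, 7}}
total, distinct, low = out_neighbourhood_stats(example, {5, 6, 7, 8})
print(total, distinct, low)
print(total >= 2 * distinct - low)
```

In this example only one selector lies in $T$, so $E$ is not blocked, and the inequality $\sum_{i \in E}|N^+(i) \cap T| \ge 2|N^+(E) \cap T| - 1$ holds with equality.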



Second, the number of distinct objects obtained by sampling with replacement is
always at most the number obtained when taking the same number of samples without
replacement. It follows that conditional on $\sum_{i \in E} |N^+_{H_m}(i) \cap T|$, the size $|N^+_{H_m}(E) \cap T|$
stochastically dominates $|S|$, where $S$ is a set of $\sum_{i \in E} |N^+_{H_m}(i) \cap T|$ independent,
uniformly random elements of $T$. On the other hand, for such $S$ and for fixed
$b \le |T \cap [m]|$, if $\sum_{i \in E} |N^+_{H_m}(i) \cap T| = b$ then $|S|$ stochastically dominates a
Binomial($b$, $1 - b/|T \cap [m]|$) random variable. It thus follows from standard Binomial
tail estimates (see Proposition A.1) and the fact that $|T \cap [m]| \ge \tilde{n}$, that if
$\tilde{n} \ge 4b$ then



ieE J 



pl\N+jE)nT\<{b+l)/2 

< P ( Binomial(fe, b/n) > (6 - l)/2 ) 

< exp(^-^log— j. 

We note that this upper bound is decreasing in b for b < n/(4e^), as can be 
straightforwardly checked. With (13), this yields 



P [ Y^ \N+^^ {i) r\T\^b,E is not blocked j 
\ieE I 



< pf^|7V+^(^)nTH6J.exp(^-^log^j , (14) 

from which Proposition 6.3 will follow essentially by union bounds and Binomial 
tail estimates. Some such estimates are encoded in the following straightforward 
bound, whose proof we defer to Appendix B. 

Lemma 6.4. Let G be the event that for all m G [f^',f^] and all E C [m] \ 

(Z«°«'(^™) U /+), it is the case that \E\ < J^ieE l^ff,„(«) n T| < n/iAe^). Then 

P{G) = l-0{n-'). 

Now fix $E \subset [m] \setminus I^+$, and write $s = |E|$. Then $\sum_{i \in E} |N^+_{H_m}(i) \cap T|$ stochastically
dominates a Binomial($\tilde{n}s$, $p_{\min}$) random variable (writing $p_{\min} = c\log n/n$). It
follows from the binomial tail bounds stated in Proposition A.1 that

/ \ / ~ \ 20s / , \ 20s 

/ enpmin \ (etc log n 



P(gl»iJ»nT|.20,j,(^^;5^j ,^^^j , ,15) 
the last inequality holding for $n$ large since $\tilde{n} = n' - K(K+1) = \alpha n - K(K+1)$
and so $e^{\tilde{n} p_{\min}} \ge (e/20)e^{\alpha n p_{\min}} = (e/20)n^{\alpha c}$. Next, observe that

P [E is not blocked, G) (16) 

- -^ ( * - X! I^ff™ (*) ^T\< 20s, E is not blocked ] 
\ ieE ) 

+ P ( 20s < ^ |7V+^^ (i) n r| < hl(^e^),E is not blocked j . 

Taking a union bound over sets E C [m] \ /+ with |i?| = s and using the bounds 
(14) and (15) in (16), it follows that 

P(An,.,G) 



< 



1 n \ l' eaclogn 



m\ I s ^ 

exp(-^log-^^^^ V n-/20 



Jexp(-4slog — 



(A) (B) 

Using that n > n/2 for n large and that C™) < {en/sY , we have 



<-)^(£:)' 



1/2 /(2e)3/2 (eaclogn) 



20 



< 



(2e)^/2 (eaclogn) 



20 



„ac-l/25l/2 / ~ V 16n7T/Ssl/2 

For s > 4/7 = 4/(ac- 1/2), we then have ^("^-1/2)5-1/2 ^ „7s-i/2 > ^77/8^ g^ ^j. 
such s and for n large, 

•(2e)3/2(eaclogn)20^ " 



(A)< 



,77/8 



Again using that n > n/2 for n large and that (™) < {en/sY , we have 

The preceding bounds on (A) and on (B) are decreasing in s for 4/7 < s < 
(pln^'^ "■)~^, as can be verified by differentiation; it follows that 



PjGn U U Drn,s 

"'<™<"4/7<s<(plni/2 „)-l 

Y^ J2 [(A) + (B)] 



< 



4/7<s<(plni/2„)-l n'<m<n 



<(n-n' + l)(pln^/2n)"i 



4(eaclogn)^ 



20 \ 4/7 



,77/8 



4^5,'^ /„,\3\ 4/7 



/ 80^65(4/7) 



<n^ 



44/T (eac log n)«"/T 0(1) 



,7/2 



7I2/7 



=0(n-i). 



(17) 



since 7 = ac - 1/2 <G (0, 1/4). 

We now treat the range $2 \le s < 4/\gamma$; for this we require a final lemma. We
say $H_n$ is well-separated if for all $n' \le m \le n$, for any distinct $u, v \in [m] \setminus I^+$, if
$|N^+_{H_m}(u)| \le \ln\ln n$ and $|N^+_{H_m}(v)| \le \ln\ln n$ then there is no 2-edge path (with edges
of any orientation) joining $u$ and $v$ in $H_m$.

Lemma 6.5. P (if„ is well-separated) = 1 — 0(n^^). 

We defer the proof of Lemma 6.5 to Appendix B, as it is essentially a reprise
of an argument found in [7] (though the results of [7] do not themselves directly
apply).

Fix n' < m < n, 2 < s < 4/7 and E C [m] \ /+ with \E\ = s. Write G = 
G n {H„i is well separated for all n' < m < n}. Arguing as at (16), we have 

P (e is not blocked, G 
- ^ i^-Yl I^H™ (*) ^ ^1 ^ 20s, E is not blocked, G ] 
+ P [ 20s < ^ |iV+^ (i) n r| < n/(4e2), £; is not blocked ] . 

Now note that if $\sum_{i \in E} |N^+_{H_m}(i) \cap T| \le 20s < 80/\gamma$ then all vertices in $E$ have
degree at most $80/\gamma + |[m] \setminus T| \le 80/\gamma + K(K+1)$. For $n$ large enough that
$80/\gamma + K(K+1) \le \ln\ln n$, if $H_n$ is well-separated then the sets $\{N^+_{H_m}(i) \cap T : i \in E\}$
are disjoint. Since $s \ge 2$, it follows that in this case $E$ is blocked so the
first probability on the right hand side of the preceding bound is zero. By a union
bound and the same argument used to bound (B), above, it follows that

PlGn U U Dra.s 

\ n'<m<n2<s<4/-f 

, 4 /80V23^' 
7 \ n-' 

Combining this bound with (12), (17) and Lemmas 6.4 and 6.5 then yields 

P( U U ^rn, 

\ji'<m<n 2<s<ln In n/(2p) 

< exp (3 Inn - n/(21nn)) + ©(n'^) + 0{n-^) + 0{n-'') + 0{n-'') 
=0(n-'<) , 
which, recalling (10), completes the proof of Proposition 6.3. D 
Appendix A. Binomial tail bounds

In this section we recall standard binomial tail bounds. The bounds in the 
following proposition are contained in [17], Lemma 1.1 and [14], Theorem 1.1. 

Proposition A.1. For $m \in \mathbb{N}$ and $0 < q < 1$, if $X = \mathrm{Binomial}(m, q)$ then writing
$\mu = mq$, for $k > \mu$ we have

\[ P(X \ge k) \le \exp(-\mu - k\ln(k/(e\mu))), \]

for $k < \mu$ we have

\[ P(X \le k) \le \exp(-\mu - k\ln(k/(e\mu))), \]

and for $\varepsilon > 0$ we have

\[ P(X - \mu > \varepsilon m) \le \exp(-2\varepsilon^2 m), \qquad P(X - \mu < -\varepsilon m) \le \exp(-2\varepsilon^2 m). \]
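
As a sanity check, the first bound of Proposition A.1 can be compared with an exact binomial tail. The sketch below is illustrative: the function names `upper_tail` and `chernoff_bound` and the parameter values are assumptions chosen for this example.

```python
from math import comb, exp, log

def upper_tail(m, q, k):
    """Exact P(X >= k) for X ~ Binomial(m, q)."""
    return sum(comb(m, j) * q ** j * (1 - q) ** (m - j) for j in range(k, m + 1))

def chernoff_bound(m, q, k):
    """exp(-mu - k ln(k / (e mu))) with mu = m q, the form used in Proposition A.1."""
    mu = m * q
    return exp(-mu - k * log(k / (exp(1) * mu)))

m, q = 50, 0.1                      # illustrative parameters, so mu = 5
for k in (6, 10, 20):               # values of k above mu
    print(k, upper_tail(m, q, k), chernoff_bound(m, q, k))
```

The bound is loose for $k$ close to $\mu$ and becomes informative once $k$ exceeds $e\mu$, which is the regime in which it is applied in Section 6.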

Appendix B. Remaining proofs 

Proof of Lemma 3.3. Fix $\varepsilon > 0$ and let $a > 1$ be large enough that $1 - e^{-e^{-a}} < \varepsilon/4$
and that $6(e/4)^{e^a} < \varepsilon/2$. Recall that $p_1 = \frac{\ln n - a}{n}$ and $p_2 = \frac{\ln n + a}{n}$.

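Observe that $\tau(\mathcal{H}_n) > p_2$ if either $Z^{\mathrm{row}}(H_{n,p_2})$ or $Z^{\mathrm{col}}(H_{n,p_2})$ is non-empty. As
both $|Z^{\mathrm{row}}(H_{n,p_2})|$ and $|Z^{\mathrm{col}}(H_{n,p_2})|$ are asymptotically Poisson($e^{-a}$), our choice
of $a$ yields that

$$\mathbf{P}(\tau(\mathcal{H}_n) > p_2) \le 2\left(1 - e^{-e^{-a}}\right) + o(1) \le \frac{\varepsilon}{2} + o(1).$$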

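This gives the first bound of the lemma. For the second bound, we claim that it
suffices to prove $\mathbf{P}(\mathcal{D}_K(p_1, p_2)) \ge 1 - \varepsilon/2$.

Indeed, assuming this bound, since $\mathcal{D}_K(p_1, p_2) \cap \{\tau(\mathcal{H}_n) \le p_2\} \subset \mathcal{D}_K(p_1, \tau(\mathcal{H}_n))$,
we have

$$\mathbf{P}(\mathcal{D}_K(p_1, \tau(\mathcal{H}_n))) \ge \mathbf{P}(\mathcal{D}_K(p_1, p_2)) - \mathbf{P}(\tau(\mathcal{H}_n) > p_2) \ge 1 - \varepsilon + o(1).$$

Given $i, j \in [n]$, if $U_{ij} > p_1$, then $e = ij \notin H_{n,p_1}$. We have

$$\mathbf{P}(e \in H_{n,p_2} \mid e \notin H_{n,p_1}) = \mathbf{P}(U_{ij} \le p_2 \mid U_{ij} > p_1) \le \frac{p_2 - p_1}{1 - p_1} \le \frac{4a}{n}.$$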

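We use this estimate to study $(N^+_{H_{n,p_2}}(i))_{i \in Z^{\mathrm{row}}(H_{n,p_1})}$. For $i, j \in [n]$, by the preceding bound and a union bound,

$$\mathbf{P}\left(N^+_{H_{n,p_2}}(i) \cap N^+_{H_{n,p_2}}(j) \ne \emptyset \mid i, j \in Z^{\mathrm{row}}(H_{n,p_1})\right) \le n\left(\frac{4a}{n}\right)^2 = \frac{(4a)^2}{n}.$$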



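By another union bound, for any fixed set $Z^+ \subset [n]$ it follows that

$$\mathbf{P}\left((N^+_{H_{n,p_2}}(i))_{i \in Z^+} \text{ are not pairwise disjoint} \mid Z^{\mathrm{row}}(H_{n,p_1}) = Z^+\right) \le \frac{(4a|Z^+|)^2}{n}. \tag{18}$$

Similarly, for any fixed $Z^+, Z^- \subset [n]$,

$$\mathbf{P}\Big(\bigcup_{i \in Z^+} N^+_{H_{n,p_2}}(i) \cap Z^- \ne \emptyset \mid Z^{\mathrm{row}}(H_{n,p_1}) = Z^+,\ Z^{\mathrm{col}}(H_{n,p_1}) = Z^-\Big) \le \frac{4a|Z^+||Z^-|}{n}. \tag{19}$$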
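Finally, given that $i \in Z^{\mathrm{row}}(H_{n,p_1})$ we have $|N^+_{H_{n,p_2}}(i)| \preceq_{\mathrm{st}} \mathrm{Bin}(n, 4a/n)$. It follows
by a Chernoff bound that

$$\mathbf{P}\left(|N^+_{H_{n,p_2}}(i)| \ge 8a \mid i \in Z^{\mathrm{row}}(H_{n,p_1})\right) \le e^{-16a^2}.$$

By a union bound, for any $Z^+ \subset [n]$,

$$\mathbf{P}\Big(\bigcup_{i \in Z^+} \{|N^+_{H_{n,p_2}}(i)| \ge 8a\} \mid Z^{\mathrm{row}}(H_{n,p_1}) = Z^+\Big) \le |Z^+| e^{-16a^2}. \tag{20}$$

Observe that a similar argument conditioning on $Z^{\mathrm{col}}(H_{n,p_1}) = Z^-$ gives the
same bounds in (18) and (20) for the sequence $(N^-_{H_{n,p_2}}(i))_{i \in Z^{\mathrm{col}}(H_{n,p_1})}$. Additionally,
$\bigcup_{i \in Z^{\mathrm{col}}(H_{n,p_1})} N^-_{H_{n,p_2}}(i) \cap Z^{\mathrm{row}}(H_{n,p_1}) \ne \emptyset$ implies $\bigcup_{i \in Z^{\mathrm{row}}(H_{n,p_1})} N^+_{H_{n,p_2}}(i) \cap Z^{\mathrm{col}}(H_{n,p_1}) \ne \emptyset$.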

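So far the bounds obtained depend on the size of fixed sets $Z^+, Z^- \subset [n]$. Now
let $\mathcal{K}$ be the event that both $Z^{\mathrm{row}}(H_{n,p_1})$ and $Z^{\mathrm{col}}(H_{n,p_1})$ have size at most $K = K(a) = \lfloor 2e^a \rfloor$. We claim that

$$\mathbf{P}(\mathcal{K}) = \mathbf{P}\left(|Z^{\mathrm{row}}(H_{n,p_1})|, |Z^{\mathrm{col}}(H_{n,p_1})| \le K\right) \ge 1 - 2(e/4)^{e^a} + o(1). \tag{21}$$

For this, we use that if $X = \mathrm{Poisson}(\lambda)$, then for $x > \lambda$,

$$\mathbf{P}(X \ge x) \le e^{-\lambda}\left(\frac{e\lambda}{x}\right)^x.$$

In particular, $\mathbf{P}(X \ge 2\lambda) \le (e/4)^{\lambda}$. Since $|Z^{\mathrm{row}}(H_{n,p_1})|$ and $|Z^{\mathrm{col}}(H_{n,p_1})|$ are asymptotically Poisson($\lambda$) with $\lambda = e^a$, (21) follows by a union bound.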
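The two numerical facts used for (21) can be checked directly: the Poisson tail estimate $\mathbf{P}(X \ge 2\lambda) \le (e/4)^{\lambda}$, and the fact that the expected number of zero rows at $p_1 = (\ln n - a)/n$ is close to $e^a$. The sketch below is only an illustration; it assumes that each row consists of $n$ independent Bernoulli($p$) entries, so that the expected number of zero rows is $n(1-p)^n$, and the function names and the values of $n$ and $a$ are hypothetical choices.

```python
import math

def poisson_upper_tail(lam, x, extra_terms=500):
    """P(X >= x) for X ~ Poisson(lam), summing the pmf upward from ceil(x)."""
    i = math.ceil(x)
    # P(X = i), computed in log-space for numerical stability
    term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
    total = 0.0
    for _ in range(extra_terms):
        total += term
        i += 1
        term *= lam / i  # P(X = i) from P(X = i - 1)
    return total

def chernoff_poisson(lam, x):
    """e^{-lam} (e lam / x)^x, the Poisson tail bound for x > lam."""
    return math.exp(-lam) * (math.e * lam / x) ** x

def expected_zero_rows(n, p):
    """n (1 - p)^n: mean number of all-zero rows under the assumption
    that the n entries of each row are independent Bernoulli(p)."""
    return n * (1 - p) ** n

if __name__ == "__main__":
    n = 10**6  # illustrative size (an assumption)
    for a in (1, 2, 3):
        lam = math.exp(a)
        p1 = (math.log(n) - a) / n
        print(a, poisson_upper_tail(lam, 2 * lam), (math.e / 4) ** lam,
              expected_zero_rows(n, p1), lam)
```

Evaluating the bound at $x = 2\lambda$ gives $e^{-\lambda}(e/2)^{2\lambda} = (e/4)^{\lambda}$, which is the form used in the text.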

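We now bound $\mathcal{D}_K(p_1, p_2)$ using the above inequalities. If $\mathcal{K}$ occurs, then there
exist (possibly empty) sets $Z^+, Z^- \subset [n]$ of size at most $K$. By (18), (19), (20) we
then obtain

$$\mathbf{P}\left(\mathcal{D}_K(p_1, p_2) \cup \{\tau(\mathcal{H}_n) \le p_1\} \mid \mathcal{K}\right) \ge 1 - 2Ke^{-16a^2}. \tag{22}$$

Note that if $\tau(\mathcal{H}_n) \le p_1$ then $|Z^{\mathrm{row}}(H_{n,p_1})| + |Z^{\mathrm{col}}(H_{n,p_1})| = 0$. Thus,

$$\begin{aligned}
\mathbf{P}(\mathcal{D}_K(p_1, p_2)) &\ge \mathbf{P}\left(\mathcal{D}_K(p_1, p_2) \cup \{\tau(\mathcal{H}_n) \le p_1\} \mid \mathcal{K}\right)\mathbf{P}(\mathcal{K}) - \mathbf{P}(\tau(\mathcal{H}_n) \le p_1) \\
&\ge \left(1 - 2Ke^{-16a^2} - o(1)\right)\left(1 - 2(e/4)^{e^a}\right) - 2e^{-e^a} + o(1) \\
&\ge 1 - 6(e/4)^{e^a}.
\end{aligned}$$

In the second inequality we use (21), (22) and the fact that $|Z^{\mathrm{row}}(H_{n,p_1})|$ and
$|Z^{\mathrm{col}}(H_{n,p_1})|$ are asymptotically Poisson($e^a$); the final inequality then follows from
the fact that $a > 1$ and $K = \lfloor 2e^a \rfloor$ by straightforward calculation. By our choice
of $a$, the final bound is at least $1 - \varepsilon/2 + o(1)$, completing the proof. □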

Proof of Lemma 6.4. We first bound the maximum degree of vertices in $[n] \setminus I^+$.
A union bound together with a Chernoff bound yields

P {3v e [n] \ 1+ : \N+Jv)\ > 2np) < ne'^^P'^' < n^-"'''^", 

where the last inequality uses that $p \ge p_{\min} = c \ln n / n$. It follows that with high
probability, for any $n' \le m \le n$,

^|iV^^(i)nr| < 2\E\np< 2n/ In^/^ n < n/4e^ 

To obtain the lower bound note that 

J2\N+J^)nT\>Y,\N+J^)nTn[n']\. 

i£E ieE 

Thus, it suffices to show that 

P(Vwe [n]\I+ : \N+^{v)nTn[n']\ ^ Hi) >l-0{n-''). 
For fixed $v \in [n] \setminus I^+$, since $|[n'] \cap T| \ge n' - K(K+1)$, we have

$$\mathbf{P}\left(|N^+_{H_n}(v) \cap T \cap [n']| = 0\right) \le (1-p)^{n' - K(K+1)} \le Ce^{-n'p} = O(n^{-\gamma - 1/2}).$$

The bound above applies in particular for vertices in $\bigcup_{i \in I^-} S_i$, and there are at
most $K^2 = O(1)$ such vertices. On the other hand, if $v \in [n] \setminus (\bigcup_{i \in I^-} S_i \cup I^+)$ and
$N^+_{H_n}(v) \cap [n'] \ne \emptyset$, then either $|N^+_{H_n}(v) \cap T \cap [n']| \ne 0$ or $|N^+_{H_n}(v) \cap U \cap [n']| \ne 0$.
It thus remains to bound

$$\mathbf{P}\left(\exists v \in [n] \setminus I^+ : |N^+_{H_n}(v) \cap U| \ge 1,\ |N^+_{H_n}(v) \cap T \cap [n']| = 0\right),$$

which is $O(n^{-\gamma})$ by (9). □

Proof of Lemma 6.5. We say a vertex $v$ has low degree in $H_m$ if $|N^+_{H_m}(v)| < d := \ln\ln n$.

First consider the graph $H_{n'}$. The event that fixed vertices $v_1$ and $v_2$ are connected
by a 2-path is monotone increasing, while the event that both vertices have
low out-degree is monotone decreasing. By the FKG inequality, these events are
negatively correlated and so the probability that both events hold is bounded from
above by the product of their probabilities.
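The negative correlation between an increasing and a decreasing event can be verified by brute force on a very small example. The sketch below enumerates every directed graph on 4 vertices in which each ordered pair is an edge independently with probability $p$; the values $n = 4$, $p = 0.3$ and the degree threshold $d = 1$ are illustrative assumptions, and the function name is hypothetical.

```python
from itertools import product

def correlation_check(n=4, p=0.3, d=1):
    """Return (P(A), P(B), P(A and B)) by exact enumeration, where
    A = vertices 0 and 1 are joined by a 2-path of any orientation
        (an increasing event),
    B = vertices 0 and 1 both have out-degree at most d
        (a decreasing event)."""
    edges = [(u, v) for u in range(n) for v in range(n) if u != v]
    pa = pb = pab = 0.0
    for bits in product((0, 1), repeat=len(edges)):
        present = {e for e, b in zip(edges, bits) if b}
        k = len(present)
        weight = p**k * (1 - p) ** (len(edges) - k)

        def joined(u, v):
            # an edge between u and v in either direction
            return (u, v) in present or (v, u) in present

        def outdeg(u):
            return sum(1 for (x, _) in present if x == u)

        a = any(joined(0, w) and joined(w, 1) for w in range(2, n))
        b = outdeg(0) <= d and outdeg(1) <= d
        pa += weight * a
        pb += weight * b
        pab += weight * (a and b)
    return pa, pb, pab

if __name__ == "__main__":
    pa, pb, pab = correlation_check()
    print(pa, pb, pab, pa * pb)
```

For these parameters the enumeration covers $2^{12}$ configurations, and the printed joint probability should be no larger than the product of the two marginals, as the inequality predicts.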

The random variable $|N^+_{H_{n'}}(v_1)|$ is Binomial($n' - 1, p$) distributed; we will bound
$\mathbf{P}(|N^+_{H_{n'}}(v_1)| < d)$. In doing so we may again assume that $p$ takes its minimal
value of $p_{\min} = c \ln n / n$, since this probability is decreasing in $p$. It follows that for
$n$ sufficiently large,

P{\N+^Jv,)\<d) = ^ ("'^^)(p„.i„)Xl-Pmin)"'-'-^ 

i=Q ^ ^ 



< (1-Pmin)"'£2(2n'pmin)' 



i=0 



< 7i-""(21nn)'^+\ 

where in the first inequality we use that $1 - p_{\min} \ge 1 - p \ge 1/2$, and in the third we
use that $p_{\min} = c \ln n / n$. The random variables $|N^+_{H_{n'}}(v_1)|$ and $|N^+_{H_{n'}}(v_2)|$ are iid
and hence



\ ('9 1n ■n'l2d+2 l-n3<i, 

\Nl,M\<d,\Nijv,)\<d) < (^i^jL < '^; 



Also, the same bound holds for $\mathbf{P}(|N^+_{H_m}(v_1)| < d,\ |N^+_{H_m}(v_2)| < d)$ for any $m \in [n', n]$, since $|N^+_{H_m}(v_1)|$ and $|N^+_{H_m}(v_2)|$ are increasing in $m$.

On the other hand, the probability that $v_1$ and $v_2$ are adjacent or have a common
neighbour is at most $2p_{\min} + 4np_{\min}^2$. By a union bound, the probability that $H_{n'}$
contains two low degree vertices with a common neighbour is therefore at most

$$\binom{n'}{2} \frac{\ln^{3d} n \,(2p_{\min} + 4np_{\min}^2)}{n^{1+2\gamma}} \le \frac{6 \ln^{3d+2} n}{n^{2\gamma}}.$$

We next consider $H_m$ with $m > n'$. Let $z = m$ be the unique vertex of $H_m$ not
in $H_{m-1}$. If $H_m$ is the first graph which is not well separated, then either (a) $z$ is
adjacent to two low degree vertices $v_1, v_2$ which are neither connected by a two-edge
path nor adjacent in $H_{m-1}$, or (b) $z$ is itself a low degree vertex at (undirected)
distance 1 or 2 from a second vertex $v_0$ of low out-degree. By a union bound over the
pairs of vertices in $H_{m-1}$ we obtain that (a) occurs with probability at most

$$\binom{m-1}{2} \frac{4p^2 \ln^{3d} n}{n^{1+2\gamma}} \le \frac{2 \ln^{3d+2} n}{n^{1+2\gamma}}.$$

Similarly, since $z$ and any fixed vertex $v_0 \in [m-1]$ are at distance 1 or 2 with
probability less than $2p_{\min} + 4np_{\min}^2$, a union bound over the vertices of $H_{m-1}$
implies that (b) occurs with probability at most

$$\frac{(m-1) \ln^{3d} n \,(2p_{\min} + 4np_{\min}^2)}{n^{1+2\gamma}} \le \frac{6 \ln^{3d+2} n}{n^{1+2\gamma}}.$$

Combining these bounds and summing over $m \in [n', n]$, we obtain that

$$\mathbf{P}(H_n \text{ is not well separated}) = O\left(\ln^{3d+2} n / n^{2\gamma}\right),$$

and the latter term is $O(n^{-\gamma})$, as required. □